Univariant Extrinsic Initiator Control System for Microbes and an In Vitro Assembly of Large Recombinant DNA Molecules From Multiple Components

ABSTRACT

The invention provides, inter alia, a nucleic acid (e.g. expression vector) that comprises at least a first coding sequence and a second coding sequence. Each coding sequence is under the control of an inducible promoter of defined strength. Different promoters can have different strengths. Each promoter is responsive to the same inducer. The invention also provides: methods of expressing coding regions, methods of making a product of a multi-enzyme pathway, and methods of optimizing the yield of a product of a multi-enzyme metabolic pathway using the nucleic acids provided by the invention. Also disclosed is a method of non-enzymatic gene cloning useful for practicing the invention.

RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application No. 61/726,795, filed on Nov. 15, 2012. The entire teachings of the above application are incorporated herein by reference.

BACKGROUND OF THE INVENTION

Most metabolic pathways are not restricted by a single rate-limiting step. Exploiting a pathway for the production of metabolites requires the optimal expression of several enzymes in a tightly coordinated manner. Failure to do so will invariably result in undue metabolic burden, where metabolic imbalance can lead to the accumulation of intermediate metabolites or gene products with potential cytotoxicity or, in some cases, may affect normal cell growth. Thus, a significant challenge in producing compounds, such as pharmaceutical products or their precursors, using microbial cells as biofactories is to optimize expression of multiple enzymes participating in a given pathway.

A number of tools are currently available to allow the fine modulation of gene expression in a pathway. These include methods for generating randomized genetic knockouts and overexpression libraries, synthetic promoter libraries, tunable intergenic regions, and global techniques (e.g., artificial transcription factor engineering, ribosome engineering, global transcription machinery engineering, and genome shuffling).

Despite the availability of these tools, simultaneous optimization of the expression of a number of genes in a pathway is still highly empirical, unpredictable and time consuming. Currently, there is no way of knowing whether an optimum has been achieved by tuning with the existing tools and methods, making these highly unsatisfactory. Hence, a tacit demand, yet to be met, is a reliable method to enable the tuning of the expression of multiple genes in a single cassette with predictable optima.

SUMMARY OF THE INVENTION

In one embodiment, the present invention provides expression vectors. The expression vector comprises at least a first coding region and a second coding region. The first coding region encodes at least a first gene product, the first coding region being operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer. The second coding region encodes at least a second gene product, the second coding region being operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer.

In another embodiment, the present invention provides kits that comprise at least two expression vectors. The first expression vector comprises a coding region encoding at least a first gene product, the coding region being operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer. The second expression vector comprises a coding region encoding at least a second gene product, the coding region being operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer.

In another embodiment, the present invention provides methods of expressing at least a first coding region and a second coding region in a cell. The method comprises providing an expression vector comprising at least the first coding region and the second coding region. The first coding region is operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer. The second coding region is operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer.

In another embodiment, the present invention provides methods of expressing at least a first coding region and a second coding region in a cell. The method comprises providing at least a first expression vector comprising at least the first coding region encoding a first gene product, and at least a second expression vector comprising at least the second coding region encoding a second gene product. The first coding region is operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer. The second coding region is operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer.

In another embodiment, the present invention provides methods of optimizing yield of a product of a multi-step enzymatic pathway in a host cell. The multi-step enzymatic pathway includes at least a first reaction catalyzed by a first enzyme, and a second reaction catalyzed by a second enzyme. The method comprises determining optimal levels of expression of the first and the second enzymes; determining the ratio of a strength of a first inducible promoter to a strength of a second inducible promoter, the ratio of the strengths corresponding to the optimal levels of expression of the first and the second enzymes, the first and the second promoters being responsive to the same inducer; and constructing an expression vector. The expression vector comprises a first coding region encoding the first enzyme, the first coding region being operably linked to the first inducible promoter, and a second coding region encoding the second enzyme, the second coding region being operably linked to the second inducible promoter.
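As a rough illustration of the ratio-matching step described above, the following sketch picks, from a small set of promoters, the pair whose strength ratio is closest to a desired expression ratio. The numeric strengths and the function name closest_promoter_pair are hypothetical placeholders chosen for illustration only; they are not measured values from this disclosure.

```python
import math
from itertools import product

# Hypothetical relative promoter strengths (arbitrary units); illustration only.
PROMOTER_STRENGTHS = {"TM1": 1.0, "TM2": 0.4, "TM3": 0.1}


def closest_promoter_pair(target_ratio, strengths=PROMOTER_STRENGTHS):
    """Return (first, second) promoter names whose strength ratio
    first/second is closest to target_ratio on a log scale."""
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    best_pair = None
    best_err = math.inf
    for first, second in product(strengths, repeat=2):
        if first == second:
            continue  # the two promoters are required to differ in strength
        ratio = strengths[first] / strengths[second]
        err = abs(math.log(ratio / target_ratio))
        if err < best_err:
            best_pair, best_err = (first, second), err
    return best_pair
```

Comparing ratios on a log scale treats a two-fold excess and a two-fold shortfall symmetrically, which is one reasonable choice among several.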

In another embodiment, the present invention provides methods of gene cloning. The method comprises contacting each of a vector and a set of inserts, the set of inserts including at least a first coding region and a second coding region, with a pair of first terminal primers, a pair of second terminal primers, and at least one pair of linking primers. Each of the first terminal primers includes a first region complementary to the vector and a second region complementary to a first insert in the set of inserts, each of the second terminal primers includes a first region complementary to the vector and a second region complementary to an insert different from the first insert, and each of the linking primers includes a first region complementary to an insert in the set of inserts and a second region complementary to a different insert in the set of inserts. Each primer includes at least one phosphorothioate internucleotide linkage. The method further includes amplifying the vector and at least two inserts to produce a vector amplification product and at least two insert amplification products, each including at least one phosphorothioate internucleotide linkage; non-enzymatically cleaving the vector amplification product and the at least two insert amplification products at the at least one phosphorothioate internucleotide linkage to produce complementary single-stranded overhangs; annealing the vector amplification product and the at least two insert amplification products and thereby non-enzymatically assembling a transforming product; and, in some embodiments, further comprising introducing the transforming product into a host cell.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present invention.

FIG. 1 illustrates isoprenoid production pathways. Pathways for the production of isoprenoid (Amorphadiene or Lycopene): the DXP pathway (top row; dxs to idi), MVA pathway (bottom row, from hmgS to MVD), terpenoid synthesis pathway (ADS, crtE, crtI, crtB) and other E. coli native genes (remaining genes). A solid arrow represents a single enzymatic step, while a dashed arrow represents multiple enzymatic steps. The overexpressed pathway modules are listed in boxes (SIDF, ADS, crtEBI, SBR, KKDJ, AA). Key metabolites are shown in white boxes. Abbreviations for metabolites: GA3P: glyceraldehyde 3-phosphate, IPP: Isopentenyl pyrophosphate, DMAPP: Dimethylallyl pyrophosphate, GPP: Geranyl diphosphate, FPP: Farnesyl diphosphate, GGPP: Geranylgeranyl diphosphate.

FIG. 2 is an illustration of methods used for control of multiple pathway modules. FIG. 2A illustrates the decomposition method. Each module of the pathway was individually controlled by an independent tunable promoter where transcription levels were regulated by the cognate inducers. FIG. 2B illustrates the univariant controlling method. The system was regulated in two dimensions: the ratios of the pathway modules were modulated by applying different engineered promoters with various strengths, and the overall expression was controlled by the master regulator that simultaneously and equally tunes the level of all promoters.

FIG. 3 illustrates production inhibition caused by high gene expression. Lycopene yields in response to gene expression controlled by the IPTG-inducible T7 promoter in BL21-Gold (DE3) strain were measured. FIG. 3A illustrates cells harboring pAC-LYC (continuous expression of crtE, crtB and crtI genes) and pETK-T7-SIDF plasmids. FIG. 3B illustrates cells harboring pAC-LYC plasmid together with pETK-T7-eGFP () or pETK-T7-t-dxs (▪) or pETK-T7-t-idi (▴) plasmid. pETK-T7-t-dxs (▪) and pETK-T7-t-idi (▴) were engineered to be untranslatable into proteins. Data presented are averages of triplicates with standard deviation.

FIG. 4 illustrates optimization of two modules for lycopene production with two independent tunable promoters. FIGS. 4A-4C illustrate lycopene production response to simultaneous tuning of pBAD promoter for crtEBI module and T7, TM2 or TM3 promoter for SIDF module in BL21-Gold (DE3) strain harboring pAC-BAD-crtEBI plasmid together with pETK-T7-SIDF (FIG. 4A) or pETK-TM2-SIDF (FIG. 4B) or pETK-TM3-SIDF (FIG. 4C) plasmid. The dots indicate the lycopene yields and the surfaces were interpolated based on triangle-based cubic interpolation. The numbers in the figures indicate the highest yields achieved experimentally. FIGS. 4D-4F illustrate transcription levels of SIDF module (represented by dxs mRNA level) and crtEBI module (represented by crtE mRNA level) at various induction conditions in FIGS. 4A-4C. All the transcription levels were normalized to the level of cysG. The circled points indicate the highest lycopene production conditions in the surface and the squares indicate the covered expression range. FIG. 4G illustrates the combination of the highest production points and expression ranges in FIGS. 4D, 4E, and 4F.

FIG. 5 illustrates sequences of T7 promoter. The numbers indicate the position relative to the transcription starting point (+1). The conserved sequence (bottom arrow), polymerase binding (top left arrow) and melting/initiation (top right arrow) regions are indicated.

FIG. 6 illustrates the expression of eGFP controlled by mutant promoters and IPTG. BL21-Gold (DE3) strains harboring eGFP expression plasmids: pAC-TM1-eGFP, pAC-TM2-eGFP or pAC-TM3-eGFP and pRepressor plasmid expressing lacI gene were grown in the presence of different IPTG concentrations. eGFP was extracted 48 hrs after induction and measured with a fluorescence reader (excitation wavelength: 588 nm, emission wavelength: 610 nm). In FIG. 6A, the fluorescence of all the conditions (IPTG, mutant promoters) was normalized to the strongest expression condition (pAC-TM1-eGFP with 0.3 mM IPTG). In FIG. 6B, the fluorescence of each plasmid with various IPTG inductions was separately normalized to the level of strongest induction (0.3 mM IPTG). For each data point, the plasmids, from left to right, are TM1, TM2, and TM3. In FIG. 6C, the fluorescence at each IPTG induction for different mutant promoters was separately normalized to the level of the strongest promoter (TM1 promoter). The error bars represent the standard deviation of three biological replicates. For each data set on the X-axis (in FIGS. 6B & 6C), the IPTG concentrations are from highest (0.3 mM) to lowest (0.011 mM), from left to right.

FIG. 7 illustrates the kinetics of eGFP expression driven by different promoters. BL21-Gold (DE3) strains harboring eGFP expression plasmids: pAC-T7-eGFP, pAC-TM1-eGFP, pAC-TM2-eGFP or pAC-TM3-eGFP and pRepressor plasmid expressing lacI gene were grown in the presence of different IPTG concentrations. Cells were incubated with 2×PY medium at 37° C. in a Thermo Scientific Varioskan Flash Multimode Reader with shaking and eGFP was continuously monitored by measuring fluorescence (excitation wavelength: 580 nm, emission wavelength: 610 nm). The error bars represent the standard deviation of three biological replicates.

FIG. 8 illustrates that unregulated promoters result in decreased isoprenoid production. Amorphadiene synthetic pathways were introduced into BL21-Gold (DE3) strains with or without the pRepressor plasmid expressing the lacI repressor, which inhibited transcription from T7-based promoters before IPTG induction. Amorphadiene production was measured after various IPTG inductions. In FIG. 8A, the amorphadiene synthesis was carried out through the DXP pathway: pAC-TM2-dxs-TM3-IDF-TM2-ADS plasmid. In FIG. 8B, the amorphadiene synthesis was carried out through the MVA pathway: pAC-TM3-SBR-TM2-KKDI-TM3-AA plasmid. The error bars represent the standard deviation of three biological replicates. For each data set on the X-axis, the IPTG concentrations are from highest (0.3 mM) to lowest (0.011 mM), from left to right.

FIG. 9 illustrates in vitro expression of mutant promoters with competition. FIG. 9A is an illustration of the in vitro transcription experiment. The systems of two or three modules controlled by different promoters were combinatorially mixed together in equal amounts for the reaction. The modules were standardized using eGFP genes with various short tags that were differentially measured by qPCR. To ensure competition, the template concentration was adjusted to a high amount, as much as half of the concentration of T7 polymerase used. In FIGS. 9B & 9D, the transcription results of two (FIG. 9B) or three (FIG. 9C) module systems are presented as the copies of mRNA transcribed per copy of template DNA. P1, P2 and P3 represent different modules, with the promoters indicated on the X axis. FIGS. 9C & 9E show the ratio of the transcription levels between modules, with the promoters indicated on the X axis. The error bars represent the standard deviation of four replicates.

FIG. 10 illustrates transcription levels of two modules for lycopene production optimized with the univariant controlling approach. BL21-Gold (DE3) strains harboring the combination of plasmids for two modules with different mutant promoters were grown in the presence of different IPTG concentrations for lycopene production. The two modules used in this study were the SIDF module (pETK-TM1-SIDF or pETK-TM2-SIDF or pETK-TM2-SIDF plasmid) and the crtEBI module (pAC-TM1-crtEBI or pAC-TM2-crtEBI or pAC-TM3-crtEBI plasmid). Note that both vectors were inducible by IPTG. The copy numbers of the vectors, pET and pAC, are 100 and 30, respectively. The transcription levels of the SIDF module (dxs mRNA) (FIG. 10A) and the crtEBI module (crtE) (FIG. 10B) were measured with the protocol described in the experimental method and normalized to the level of cysG. For each data set on the X-axis, the IPTG concentrations are from highest (0.3 mM) to lowest (0.011 mM), from left to right.

FIG. 11 illustrates optimization of two modules for amorphadiene production with the univariant controlling approach. BL21-Gold (DE3) strains harboring the combination of plasmids for two modules with different mutant promoters were grown in the presence of different IPTG concentrations for lycopene production. The two modules used in this study were the SIDF module (pETK-TM1-SIDF or pETK-TM2-SIDF or pETK-TM2-SIDF plasmid) and the crtEBI module (pAC-TM1-crtEBI or pAC-TM2-crtEBI or pAC-TM3-crtEBI plasmid). FIG. 11A illustrates lycopene production at all conditions. The combinations of pathway modules (SIDF: top; crtEBI: bottom) are presented on the X axis and the IPTG concentrations are presented as different bars. For each data set on the X-axis, the IPTG concentrations are from highest (0.3 mM) to lowest (0.011 mM), from left to right. FIG. 11B illustrates lycopene production response to the expression levels of two modules. FIG. 11C illustrates lycopene production response to the combination of mutant promoters for two modules. Only the optimum lycopene yields at various IPTG concentrations are presented. The color of the dots indicates the lycopene yields.

FIG. 12 illustrates rational optimization of lycopene production. A simple rational workflow can be used to guide strain development. Firstly, a screening experiment can be conducted using only high (TM1) and low (TM3) strength promoters with various IPTG inductions to discretely cover the searching range (FIG. 12A). The response of the system to mutant promoters will reveal that applying stronger promoters for the expression of the crtEBI module than for the SIDF module gives better yields (FIG. 12B). A second round of focused experiments around the optimum conditions deduced from the screening experiment can then be carried out and the optimum condition (pETK-TM3-SIDF, pAC-TM2-crtEBI, 0.1 mM IPTG) will be attained (FIGS. 12C & 12D). By such an approach, an optimal condition can be easily identified without the need to search further. FIGS. 12A & 12B illustrate the initial screening experiment. FIGS. 12C & 12D illustrate the focused experiment. The colored dots indicate the experimental conditions and the crosses indicate the conditions that were unnecessary to test further after the initial screening study. FIGS. 12A & 12C illustrate lycopene production response to the expression levels of the two modules. FIGS. 12B & 12D illustrate lycopene production response to the combination of two modules using mutant promoters. The color of the dots indicates the lycopene yields.

FIG. 13 illustrates optimization of three modules for amorphadiene production with the univariant controlling approach. The DXP or MVA pathway was applied for amorphadiene synthesis in either BL21-Gold DE3 or MG1655 DE3 strain. Combinations of TM1, TM2, TM3 promoters were used to drive the expression of three modules for either the DXP pathway approach (pAC-TM-dxs-TM-IDF-TM-ADS plasmid) or the MVA pathway approach (pAC-TM-SBR-TM-KKID-TM-AA plasmid). Strains harboring the pathway and pRepressor plasmid expressing lacI gene were grown in the presence of different IPTG concentrations. Amorphadiene yields are presented in FIG. 13A: BL21-Gold DE3 strain, DXP pathway, FIG. 13B: MG1655 DE3 strain, DXP pathway, and FIG. 13C: MG1655 DE3 strain, MVA pathway. The combinations of pathway modules are presented on the X axis and the IPTG concentrations are presented as different bars. For each data set on the X-axis, the IPTG concentrations are from highest (0.3 mM) to lowest (0.011 mM), from left to right.

FIG. 14 illustrates transcription levels of selected strains with three modules on the pAC vector. BL21-Gold (DE3) strains harboring selected plasmids: pAC-TM1-dxs-TM2-IDF-TM1-ADS plasmid, pAC-TM2-dxs-TM1-IDF-TM3-ADS plasmid or pAC-TM3-dxs-TM3-IDF-TM2-ADS plasmid and pRepressor plasmid (expressing lacI gene) were grown in the presence of different IPTG concentrations. The combinations of pathway modules are presented on the X axis and the IPTG concentrations are presented as different colored bars. The transcription level of each module (dxs, IDF, ADS) was measured with the protocol described in the experimental method and normalized to the level of cysG. The error bars represent the standard deviation of three biological replicates. For each data set on the X-axis, the IPTG concentrations are from highest (0.3 mM) to lowest (0.011 mM), from left to right.

FIG. 15 illustrates amorphadiene production response to the relative expression levels. Strains harboring different pAC-TM-dxs-TM-IDF-TM-ADS plasmids with combinations of TM1, TM2, TM3 promoters on three modules and pRepressor plasmid expressing lacI gene were grown in the presence of different IPTG concentrations for amorphadiene production. FIGS. 15A & 15C illustrate amorphadiene production in BL21-Gold DE3 (FIG. 15A) or MG1655 DE3 (FIG. 15C) strains in response to the relative expression level (a.u.) of three modules calculated by “Equation 1”. The color of the dots indicates the lycopene yields. The high production conditions (more than 50% of the maximum yield) are presented in FIG. 15B for BL21-Gold DE3 strain and FIG. 15D for MG1655 DE3 strain.

FIG. 16 illustrates ternary plot representation of the amorphadiene production. FIG. 16A: BL21-Gold DE3 strain, DXP pathway; FIG. 16B: MG1655 DE3 strain, DXP pathway; and FIG. 16C: MG1655 DE3 strain, MVA pathway. The percentage of each module was calculated based on “Equation 1”. Only the optimum amorphadiene yields at various IPTG concentrations are presented at each point and the color of the dots indicates the response yield.

FIG. 17 illustrates ternary plot of amorphadiene response to ratios of three modules. FIG. 17A is an illustration of a ternary plot. Each species (FIG. 17A, 17B, or 17C) is 100% at the corner of the equilateral triangle and every point represents a different composition of the three components. By drawing parallel lines along the borders, the percentage of each species is equal to the length of the line aiming at the opposite border. FIG. 17B is an illustration of the rational optimization process. The red circled points indicate the initial screening experimental conditions, which separated the space into six regions (I, II, III, IV, V, and VI). Based on the yields at various points, the follow-up focused experiments were then carried out at the selected region or regions defined by dashed lines in FIGS. 17C & 17D. FIGS. 17C & 17D are ternary plots of three modules in BL21-Gold DE3 (FIG. 17C) or MG1655 DE3 (FIG. 17D) strain. The percentage of each module was calculated based on “Equation 1”. Only the optimum amorphadiene yields at various IPTG concentrations are presented at each point and the color of the dots indicates the yield. The regions defined by gray area indicate the focused conditions resulting from the initial screening study.
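The ternary representation described for FIG. 17A can be sketched in code. The following minimal example, offered only as an illustration, maps a three-component composition onto 2-D coordinates inside an equilateral triangle. The function name ternary_to_xy and the corner placement are assumptions made here, not part of the disclosure, and the module percentages themselves would come from “Equation 1”, which is not reproduced in this sketch.

```python
import math


def ternary_to_xy(a, b, c):
    """Map a three-component composition to 2-D coordinates inside an
    equilateral triangle with corners A=(0, 0), B=(1, 0), C=(0.5, sqrt(3)/2).

    The corner placement is an arbitrary convention chosen for this sketch.
    """
    total = a + b + c
    if total <= 0 or min(a, b, c) < 0:
        raise ValueError("components must be non-negative with a positive sum")
    # Normalize so that the three fractions sum to 1.
    frac_b = b / total
    frac_c = c / total
    x = frac_b + 0.5 * frac_c
    y = (math.sqrt(3) / 2) * frac_c
    return x, y
```

A composition that is 100% of one component lands on the corresponding corner, and an equal three-way mixture lands at the centroid of the triangle.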

FIG. 18 illustrates extracellular metabolite accumulation. Extracellularly accumulated metabolites of the DXP pathway were measured for BL21-Gold DE3 strain under the same conditions as the amorphadiene production optimization through the DXP pathway. FIG. 18A illustrates efflux of DXP pathway intermediates into the growth medium. DXP (1-Deoxy-D-xylulose 5-phosphate) and MEC ((E)-4-Hydroxy-3-methyl-but-2-enyl pyrophosphate) were found highly accumulated in the medium. FIG. 18B illustrates the correlation between amorphadiene and extracellular MEC. All the conditions are presented. FIG. 18C is a ternary plot representation of the extracellular MEC concentrations in response to the ratios of pathway modules. Only the optimum amorphadiene yields at various IPTG concentrations are presented at each point and the color of the dots indicates the response yield.

FIG. 19 illustrates accumulation of MEC and DXP in the medium. BL21-Gold (DE3) strains harboring pAC-TM-dxs-TM-IDF-TM-ADS plasmid with combinations of TM1, TM2, TM3 promoters and pRepressor plasmid (expressing lacI gene) were grown in the presence of different IPTG concentrations. The combinations of pathway modules are presented on the X axis and the IPTG concentrations are presented as different colored bars. The concentrations of amorphadiene (FIG. 19A), MEC (FIG. 19B) and DXP (FIG. 19C) in the medium were measured at the end point. Note that one molecule of amorphadiene is synthesized from three molecules of MEC or DXP. For each data set on the X-axis, the IPTG concentrations are from highest (0.3 mM) to lowest (0.011 mM), from left to right.

FIG. 20 illustrates the cross-lapping in vitro assembly (CLIVA) method. FIG. 20A is an illustration of the design at one junction between two modules (black and gray). The cross-lapping primer consists of a gene specific sequence (GSS) and a tag sequence complementary to the adjacent primer's GSS. The phosphorothioate modifications are indicated as circles. An “Ox/y” designation was used to define the primers, where O denotes overlap, x is the length of the overlap, and y is the spacing, in bases, between successive modifications along the sequence. FIG. 20B is an illustration of the assembly of multiple DNA modules into one plasmid.
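To make the “Ox/y” designation concrete, the following small sketch lists where modifications would fall in an overlap of length x with one modification every y bases. The function names, and the choice to count positions from one end of the overlap starting at 1, are assumptions made for illustration; the disclosure does not specify a counting convention here.

```python
def modification_positions(overlap_len, spacing):
    """Return 1-based positions along the overlap that would carry a
    phosphorothioate linkage for an O<overlap_len>/<spacing> design.

    Assumption: one modification is placed after every `spacing` bases,
    counted from one end of the overlap.
    """
    if overlap_len <= 0 or spacing <= 0:
        raise ValueError("overlap_len and spacing must be positive")
    return list(range(spacing, overlap_len + 1, spacing))


def design_label(overlap_len, spacing):
    """Format a design in the Ox/y notation used in the figure legends."""
    return f"O{overlap_len}/{spacing}"
```

Under this convention, a design whose spacing equals the overlap length carries a single modification, which is consistent with the single-modification designs such as O12-13/12-13 described for FIG. 27.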

FIG. 21 illustrates optimization of the CLIVA method. FIG. 21A illustrates optimization of cations using the assembly of PAC-SIDF plasmid with O12-13/4-5 design (12-13 bases overlap with modification at every 4-5 bases). FIG. 21B illustrates the transformation efficiency of PAC-SIDF plasmid in the presence of MgCl2. FIG. 21C illustrates the effect of the phosphorothioate modification frequency on the assembly efficiency. O12-13/4-5, O12-13/6-7, O12-13/12-13 designs: 12-13 bases overlap with modification at every 4-5 bases, 6-7 bases or 12-13 bases. FIG. 21D illustrates the effect of overlap length on the assembly efficiency. O12-13/4-5, O24-25/4-5, O36-38/4-5 designs: 12-13 bases, 24-25 bases, 36-38 bases overlap with modification at every 4-5 bases. All the experiments were done in triplicate and the standard errors are shown in the figure.

FIG. 22 illustrates assembly of the DXP pathway. FIG. 22A illustrates the DXP pathway and Fe-S cluster assembling pathway. GA3P: glyceraldehyde 3-phosphate, DXP: 1-deoxy-D-xylulose 5-phosphate, MEP: 2C-methyl-D-erythritol 4-phosphate, CDP-ME: 4-diphosphocytidyl-2C-methyl D-erythritol, CDP-MEP: 4-diphosphocytidyl-2C-methyl D-erythritol 2-phosphate, MEC: 2C-methyl-D-erythritol 2,4-diphosphate, HMBPP: hydroxylmethylbutenyl diphosphate, IPP: Isopentenyl pyrophosphate, DMAPP: Dimethylallyl pyrophosphate, GPP: Geranyl diphosphate, FPP: Farnesyl diphosphate, GGPP: Geranylgeranyl diphosphate. FIG. 22B is an illustration of various modules assembled in the project (correlated to Table 7). CAM: chloramphenicol resistance gene, p15A-ori: p15A origin of replication.

FIG. 23 illustrates the performance of different combinations of DXP pathway genes in E. coli. FIG. 23A illustrates 48 h amorphadiene yield. Different concentrations of IPTG are represented by bars with different colors. The experiment was repeated four times and the standard errors are shown. FIG. 23B illustrates the correlation of pathway modules with amorphadiene yield at optimal IPTG inductions. FIG. 23C illustrates early response of intracellular metabolites at 3 h after induction. The gray areas indicate the overexpressed section of the DXP pathway. The experiment was repeated twice and the averages are shown.

FIG. 24 illustrates the kinetics of S-IAA-PAC, S-R-IAA-PAC and S-R-DEF-IAA-PAC strains. FIG. 24A illustrates the specific concentration (μM/OD) of intracellular metabolites: DXP, MEP and MEC. The rest of the metabolites accumulated at concentrations less than 2 μM/OD and were neglected. FIG. 24B illustrates the concentration of extracellular metabolites: DXP, MEP, MEC and amorphadiene. The rest of the metabolites accumulated at concentrations less than 50 μM and were neglected. FIG. 24C illustrates the cell density.

FIG. 25 illustrates the effects of Fe-S operons on the amorphadiene production. Different concentrations of IPTG are represented by bars with different colors. The experiment was repeated four times and the standard errors of four replicates are presented as error bars. Two-tailed Student's t-tests were carried out to compare certain conditions, and the resulting p-values are presented as P in the figure. For each data set on the X-axis, the IPTG concentrations are from highest (0.3 mM) to lowest (0.011 mM), from left to right.

FIG. 26 illustrates different cations' effects on the assembly efficiency. The assembly efficiencies of PAC-SIDF plasmid with O36-38/4-5 design (36-38 bases overlap with phosphorothioate modification at every 4-5 bases) at 2.5 mM (left) or 12.5 mM (right) of MgCl₂, CaCl₂, CoCl₂ or CuCl₂ are presented. All the experiments were done in triplicate and the standard errors are presented in the figure.

FIG. 27 illustrates the assembly efficiency of overlap designs with a single phosphorothioate modification. O12-13/12-13, O24-25/24-25, O36-38/36-38: 12-13 bases, 24-25 bases, 36-38 bases homologous sequences with one phosphorothioate modification. All the experiments were done in triplicate and the standard errors are presented in the figure.

FIG. 28 is the sequence of codon optimized ADS gene, SEQ ID NO: 29.

DETAILED DESCRIPTION OF THE INVENTION

In order to simultaneously control a number of promoters with different strengths, there is a need for the use of a single resource (inducer and transcribers/polymerases) that can modulate these promoters for the expression of multiple down-stream genes. These promoters are herein referred to as ‘dependent promoters’, as they are all dependent on the same externally controlled resource for function. If there is any perturbation in the availability of the resource, the expression from each of these dependent promoters should change accordingly and expression of down-stream coding regions (e.g., genes) should change in fixed proportions based on the strength of the promoters. In addition, by tuning the availability of the resource, all these promoters with the same control mechanism should behave similarly, providing another layer of systematic control: the overall expression level.
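The fixed-proportion behavior of dependent promoters can be illustrated with a toy calculation. In the sketch below, every number and function is an assumption introduced for illustration: the relative strengths are hypothetical, and the saturating response of the shared resource to the inducer is just one convenient functional form, not a model taken from this disclosure.

```python
# Hypothetical relative promoter strengths (arbitrary units); illustration only.
STRENGTHS = {"TM1": 1.0, "TM2": 0.4, "TM3": 0.1}


def shared_resource_level(inducer_mM, half_max_mM=0.05):
    """Assumed saturating response of the shared resource to the inducer.

    half_max_mM is a hypothetical parameter, not a measured value.
    """
    if inducer_mM < 0:
        raise ValueError("inducer concentration must be non-negative")
    return inducer_mM / (half_max_mM + inducer_mM)


def expression_levels(inducer_mM, strengths=STRENGTHS):
    """Expression of each module as promoter strength times the one
    shared resource level, so all modules scale together."""
    resource = shared_resource_level(inducer_mM)
    return {name: strength * resource for name, strength in strengths.items()}


low = expression_levels(0.011)
high = expression_levels(0.3)
# The ratio between two modules is the same at both induction levels,
# while the overall expression level rises with the inducer.
ratio_low = low["TM1"] / low["TM3"]
ratio_high = high["TM1"] / high["TM3"]
```

Because every module is multiplied by the same resource term, changing the inducer moves the overall level without disturbing the ratios set by the promoter strengths.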

μ-UNeICS

This application describes the development of a novel tool (μ-UNeICS) using a plethora of currently available methods for the co-expression of multiple enzymes (coding regions, in general) in pathways controlled by a single heterologous/extrinsic transcriber. The result is the retention of a constant ratio of expression when a single type (univariant) of extrinsic transcriber is distributed over multiple promoters of different strengths, with all the promoters responding accordingly to induction regardless of whether and when competition for resources exists. The performance of the expression system is well controlled and can be predicted with a simple model. This systematic method allows unprecedented control over a wide dynamic range and the rapid identification of the optimal combinations of fixed ratios of promoter-driven expression. Furthermore, by gaining an insightful understanding of the pathways, a rational optimization process can be applied to efficiently identify the global optimum. The method is useful in industries such as energy, health (pharmaceuticals) and environment, through the manipulation of genetic and metabolic pathways (synthetic biology, metabolic engineering). Advantageously, identification of the optimal combinations of the fixed ratios of promoter-driven expression saves labor, time and experimental resources. Through this effort, previously unpredicted combinations of some isoprenoid genes were rapidly determined to result in the generation of high producing strains.

In a first aspect, the invention provides isolated nucleic acids (e.g. vectors) containing a first coding region and a second coding region. The first coding region encodes at least a first gene product, where the first coding region is operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer. Similarly, the second coding region encodes at least a second gene product, where the second coding region is operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer. In other embodiments, the invention provides collections of isolated nucleic acids, e.g., kits of two or more vectors, analogous to the single nucleic acid embodiment described above except that the two coding regions and their respective promoters are on different vectors in the kit.

A “coding region” is a nucleic acid comprising a sequence encoding a protein. A coding region may include one or more coding regions, including, for example, a multi-gene polycistron, such as an operon, from any source, either synthetic or naturally occurring. The coding region can comprise any protein, such as a cytokine, a growth factor, an enzyme, an antibody (or antibody mimetic), a receptor, or a structural protein. In certain embodiments, the coding region comprises an enzyme.

A coding region and a promoter, such as an inducible promoter, are “operably linked” when the promoter can modulate the transcription of the coding region, under appropriate conditions. In some embodiments, two sequences can be in operative association, and additional sequence elements such as enhancers or promoters may be present in the construct. For example, in certain embodiments, the polycistron includes ribosome binding sites in between open reading frames.

An “inducible promoter” is a promoter region whose activity can be modulated in trans by an inducer and includes promoters subject to either direct or indirect modulation by the inducer. Modulation can include, for example, direct activation (adding an inducer permits an element needed for transcription to function) or direct derepression (adding an inducer removes an element that is inhibiting normal transcription). Indirect activation (or derepression) can include modulating the transcription of another agent that modulates transcription of a coding region. The present invention illustrates this latter example by employing variant IPTG-inducible T7 polymerase promoters on the coding regions of interest, while expressing the T7 polymerase from another IPTG-inducible promoter, thereby directly and indirectly inducing the coding region of interest. Other promoters and agents can be used analogously, consonant with the present invention. Exemplary promoters for use in the invention include BAD (arabinose inducible; see e.g. Schleif, R. Trends in Genetics 16(12):559-565 (2000)), lac, Tet, RNA polymerase promoters (T7, T3, or SP6), any kind of engineered promoter in operative association with operon(s) that makes it inducible, and combinations of any of the foregoing. In particular embodiments, the promoters include T7 family members, such as any one of SEQ ID NOs: 1-12. In more particular embodiments, the promoters include SEQ ID NO: 3 (also called TM1, herein), SEQ ID NO: 7 (TM2, herein), and SEQ ID NO: 9 (TM3, herein). Melting and initiation regions of an RNA polymerase are exemplified by nucleotides 8 to 19 and 20 to 28 of SEQ ID NO: 1, respectively. Promoters of different strength, based on the T7 promoter, are exemplified by SEQ ID NOs: 1-12. Promoters of varying strength can be produced from other promoters analogously to the above examples for T7.

In certain embodiments, inducible promoters for use in the present invention are coupled to heterologous coding sequences, i.e., the combination of promoter and coding sequence is a product of man that is not naturally occurring.

Plasmids provided by the invention can be for exogenous maintenance as a nucleic acid(s) separate from a host genome or, in other embodiments, for integration into the host's genome. Plasmids can be single copy, low copy (e.g. less than 10 copies per cell, such as about: 2, 3, 4, 5, 6, 7, 8, or 9 copies per cell) or high copy (e.g. more than 10 copies per cell, such as about: 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, or 100 copies per cell, or more).

In some embodiments, the multi-step enzymatic pathway is an isoprenoid production pathway. In more particular embodiments, the isoprenoid is lycopene or amorphadiene. The multi-step pathway can be either the DXP or MVA pathway. See FIGS. 1 and 22B.

In particular embodiments, the coding region includes one or more genes selected from dxs (see, e.g., E. coli GeneID No. 945060), idi (see, e.g., E. coli GeneID No. 949020), ispA (see, e.g., E. coli GeneID No. 945064), ispD (see, e.g., E. coli GeneID No. 948269), ispF (see, e.g., E. coli GeneID No. 945057), crtE (see, e.g., Pantoea agglomerans phytoene synthase ACCESSION No. M38424.1), crtB (see, e.g., Pantoea agglomerans prephytoene pyrophosphate synthase ACCESSION No. M38423.1), crtI (see, e.g., Pantoea agglomerans phytoene dehydrogenase ACCESSION No. M38423.1), ADS (see, e.g., SEQ ID NO: 29, see also protein sequence AAF98444.1), hmgS (see, e.g., Saccharomyces cerevisiae GeneID No. 854913), atoB (see, e.g., E. coli GeneID No. 946727), hmgR (see, e.g., Saccharomyces cerevisiae GeneID No. 854900), MVK (see, e.g., Saccharomyces cerevisiae GeneID No. 855248), PMVK (see, e.g., Saccharomyces cerevisiae GeneID No. 855260), MVD (see, e.g., Saccharomyces cerevisiae GeneID No. 855779), Isc operon (iron-sulfur cluster, or a portion thereof), Suf operon (sulfur mobilization operon, or a portion thereof) or a combination of the foregoing. In a related aspect, the invention provides an isolated nucleic acid comprising, consisting essentially of, or consisting of SEQ ID NO: 29, or a biologically active fragment thereof.

Homologs or substantially similar peptide sequences to any of the foregoing proteins can be used in the invention. “Similar peptide sequences” can be naturally occurring (e.g., allelic variants or homologous sequences from other species) or engineered variants to the above reference sequences and will exhibit substantially the same biological function and/or will be at least about 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99% or more homologous (i.e., conservative substitutions (see, e.g., Henikoff and Henikoff PNAS 89 (22): 10915-10919 (1992) and Styczynski et al., Nat. Biotech. 26 (3): 274-275 (BLOSUM, e.g., BLOSUM 45, 62 or 80) or Dayhoff et al., Atlas of protein sequence and structure (volume 5, supplement 3 ed.). Nat. Biomed. Res. Found. pp. 345-358 (PAM, e.g., PAM 30 or 70))) or identical at the amino acid level, e.g., over a length of at least about 10, 20, 40, 60, 80, 100, 150, 200 or more amino acids or over the entire length of the mature reference peptide sequence.

In particular embodiments, the coding region of a plasmid provided by the invention includes: dxs, idi, ispD, and ispF (siDF); crtE, crtB, and crtI (crtEBI); dxs; idi, ispD, and ispF (iDF); ADS; hmgS, atoB, and hmgR (SBR); MVK, PMVK, MVD, and idi (KKDI); ADS and ispA (AA); or a combination thereof. In more particular embodiments, the plasmids provided by the invention include any of those described in Tables 2 or 7.

Any suitable cell can be a host cell transfected with a nucleic acid (e.g., vector) provided by the invention. In particular embodiments, the cell is a bacterium, a yeast cell, an insect cell, or a mammalian cell. In more particular embodiments, the cell is a bacterium, such as E. coli, and in more particular embodiments, the E. coli is selected from Ext-10-gold, DH10B, or K12 (including MG1655, such as MG1655 DE3). In certain embodiments, the cell comprises a functional lacI gene and in more particular embodiments, the cell expresses a polymerase (such as a T7 polymerase) from a lac promoter, more particularly a lacI-repressible lac promoter. In particular embodiments, the cell (e.g., a bacterium, such as E. coli) comprises one or more nucleic acids comprising: TM3-SBR-TM2-KKDI-TM3-AA (e.g. in plasmid pAC); TM3-siDF (e.g. in pETK); TM2-crtEBI (e.g. in pAC); or a combination thereof, such as TM3-siDF (e.g. in pETK) and TM2-crtEBI; TM2-SBR-TM1-KKDI-TM3-AA (e.g. in plasmid pAC); TM1-dxs-TM2-IDF-TM1-AA (e.g. in plasmid pAC); TM2-dxs-TM3-IDF-TM2-AA (e.g. in plasmid pAC); TM3-siDF (e.g. in pETK); and TM1-crtEBI.

In related aspects, the invention provides methods of: expressing one or more coding regions (e.g., by providing a host cell comprising one or more vectors provided by the invention, contacting the cell with the inducer under conditions to express the one or more coding regions), making a product of a multi-step enzymatic pathway (e.g. by providing a host cell comprising one or more vectors provided by the invention, contacting the cell with the inducer under conditions to express the one or more coding regions, and detecting and/or isolating the product of the multi-step enzymatic pathway, such as lycopene or amorphadiene), as well as methods of optimizing the yield of a product of a multi-step enzymatic pathway (for example, by determining optimal levels of at least first and second coding regions, e.g., enzymes, in the pathway, determining the ratio of strengths of inducible promoters for the coding regions and then providing one or more expression vectors provided by the invention with the coding regions operably linked to inducible promoters of suitable strengths).

Optimal levels of expression for a given system can be determined by any means. In certain embodiments, the levels are determined according to Equation 1, below, or an analogous equation (for example, replacing IPTG induction strength with simply induction strength, and mutant promoter strength with simply promoter strength, et cetera) depending on the particular system employed. In some embodiments, various permutations of coding regions and promoters are screened and an output, such as a pathway product, is measured to identify an optimum under given conditions (e.g., culture conditions). In other embodiments, the system can be modeled computationally, e.g., using analytical, numerical, and/or computer-learning modalities. In still other embodiments, a system can be both modeled and screened. The starting point for any screening or modeling can, in some embodiments, be rationally designed and iteratively modified based on the results of modeling and/or screening (e.g. modeling after screening, or vice versa, as well as iteratively screened or modeled with finer resolution at each iteration). Optima for a given pathway can vary between organisms or strains of an organism based on, inter alia, cell genotypes, culture conditions, et cetera.

As a proof-of-concept, the examples below demonstrate how the “UNivariant extrinsic Initiator Control System for microbes (μ-UNeICS)” was applied in the production of isoprenoids (terpenoids), which are a large family of natural compounds that can be used as fragrances, insecticides, nutraceuticals and pharmaceuticals. This systematic approach is extendable to systems with, e.g., four or even more modules and is applicable to all processes involving the modulation of multiple recombinant DNAs in microbes for any purpose.

CLIVA

In another aspect, the invention provides methods of nucleic acid assembly, such as gene cloning, which is termed CLIVA (Cross-lapping In Vitro Assembly), herein. In these methods provided by the invention, a first nucleic acid, such as a coding region, is joined to at least a second nucleic acid, such as a vector, by virtue of complementary sticky ends between the first and second nucleic acids. In particular embodiments, the sticky ends are created and, optionally, hybridized, non-enzymatically, e.g., without a nuclease or a ligase. Instead, the nucleic acids are cleaved (using iodine in an ethanolic solution) at phosphorothioate modifications in the nucleic acid backbone of each nucleic acid to be joined. This process is illustrated in FIG. 20. These methods can further include a step of transforming a cell with the joined first and second nucleic acids.

Briefly, these methods, in certain embodiments, employ an amplification step with a pair of primers for each nucleic acid to be joined. Each primer in a pair has at least two regions: a “primer region”, generally at the 3′ end of the primer, and a “homologous sequence” (also called a “homologous region” herein), generally at the 5′ end of the primer. A “primer region” comprises a conventional polymerase chain reaction (PCR) primer for amplifying the nucleic acid to which it hybridizes (e.g. a first sequence). A “homologous region”, in turn, comprises a sequence that can hybridize to another sequence, namely the sequence to which the first sequence is to be joined. For example, in some embodiments, a homologous region can hybridize to a sequence within (or comprising) the primer region of another primer. Following amplification with this primer pair, the amplified nucleic acid includes the first sequence and two homologous regions, where at least one strand of each homologous region comprises at least one phosphorothioate linkage. Following cleavage of the phosphorothioate linkage, complementary single-stranded sticky ends (overhangs) are generated, two sticky ends per amplified nucleic acid. Following this basic design scheme, numerous fragments can be joined together, such as at least 2 (e.g. a nucleic acid of interest and a vector), or 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15 nucleic acids, or more.

Primer regions will be designed according to standard practices for PCR primer design, taking into account the complexity of the nucleic acid mixture, desired melting temperature, secondary structure, dimerization, et cetera. Homologous sequences can be designed according to the particular construct to be generated. Typically, homologous sequences will have a length after cleavage of the phosphorothioate modification such that the single-stranded overhangs are at least about: 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, or 35 nucleotides in length, or more. In particular embodiments, the single-stranded overhangs are about 32-42 nucleotides, more particularly about 36-38 nucleotides. Sequences that will hybridize (e.g., overhangs) can comprise both primer region sequences and homologous sequences.

Primers can have varying densities of phosphorothioate modifications. Typically, the first phosphorothioate modification is at about: the 2nd, 3rd, 4th, 5th, or 6th nucleotide from the 3′ end of the primer. In more particular embodiments, the first phosphorothioate modification is at the 3rd, 4th, or 5th nucleotide from the 3′ end of the primer. The phosphorothioate modifications can be repeated about every: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 bases. In certain embodiments, the phosphorothioate modifications are repeated about every: 12-13 bases, 6-7 bases, or 4-5 bases. In more particular embodiments, the phosphorothioate modifications are repeated every 4-5 bases. Phosphorothioate modifications can be in the primer region, in the homologous sequence, or in both. From 5′ to 3′, the last phosphorothioate modification typically needs to be at the last bases of the homologous sequence.
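The spacing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the default values (first modification at the 4th nucleotide from the 3′ end, repeated every 5 bases) and the use of "*" as a marker are hypothetical choices, and the additional requirement on where the last modification falls relative to the homologous sequence is not modeled.

```python
def phosphorothioate_positions(primer_length, first_from_3prime=4, spacing=5):
    """Return 1-based positions, counted from the 3' end, that carry a modification.

    The defaults are illustrative picks from the ranges given in the text.
    """
    if primer_length < 1 or first_from_3prime < 1 or spacing < 1:
        raise ValueError("all arguments must be positive integers")
    return list(range(first_from_3prime, primer_length + 1, spacing))


def annotate(primer, positions_from_3prime):
    """Write the primer 5'->3' with '*' after each modified nucleotide.

    The '*' marker is an arbitrary notation chosen for this sketch.
    """
    n = len(primer)
    marked = set(positions_from_3prime)
    out = []
    for i, base in enumerate(primer):  # i = 0 is the 5' end
        pos_from_3prime = n - i        # 1 = last base at the 3' end
        out.append(base + ("*" if pos_from_3prime in marked else ""))
    return "".join(out)


if __name__ == "__main__":
    seq = "ACGTACGTACGTACGTACGT"  # made-up 20-nt primer, not from the source
    print(annotate(seq, phosphorothioate_positions(len(seq))))
```

Changing `spacing` to 12 or 6 gives the sparser densities mentioned in the text; a real design would still need to be checked against the homologous-sequence boundary.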

Following the methods provided by the invention, numerous fragments can be assembled in an “annealing reaction” where amplified nucleic acids with complementary sticky ends are allowed to hybridize via the sticky ends. In certain embodiments, the annealed nucleic acids can be used as-is, e.g., to transform a cell without further purification, for example, without a ligation reaction, although, in certain embodiments, the assembled nucleic acids can be purified and, optionally, ligated.

In some embodiments, at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or 30 nucleic acids, or more (e.g. 40, 50, 60, 70, 80, 90, or more) can be assembled in a single reaction. The final assembled product (e.g. a collection of inserts for a plasmid) can be at least about: 8, 10, 12, 14, 15, 16, 18, 20, 25, 30, 35, 40, 45, or 50 kb, or more, e.g., in particular embodiments, about 8 kb to about 22 kb. Advantageously, the methods provided by the invention allow the nucleic acids to be assembled quickly, for example in about: 12, 18, 24, 30, 36, 42, 48, 54, or 60 hours, e.g., in some embodiments, about 1-2 days, as compared to one to two weeks, or more, using conventional methods.

The annealing of nucleic acid fragments to be joined by the methods provided by the invention typically takes place in the presence of one or more cations. In more particular embodiments, the one or more cations are divalent cations (e.g. Mg²⁺, Ca²⁺, Co²⁺, or Cu²⁺). In still more particular embodiments, the divalent cation is Mg²⁺, Ca²⁺, or a combination thereof. In particular embodiments, the divalent cation is present in the annealing reaction at a concentration of about 0.5 to about: 10.0, 20.0, 30.0, 40.0, 50.0, or 60.0 mM; in more particular embodiments about: 1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 7.5, 10.0, 12.5, 15.0, 17.5 mM. In more particular embodiments, the divalent cation is present at a concentration of about 2.5 to about 12.5 mM.

EXEMPLIFICATION

Example I

UNivariant Extrinsic Initiator Control System for Microbes (μ-UNeICS)

Background and Motivation

It is now known that most metabolic pathways are not restricted by a single rate-limiting step. Exploiting a pathway for the production of metabolites requires the optimal expression of several native and/or heterologous enzymes in a tightly coordinated manner. Failure to do so will invariably result in undue metabolic burden, where metabolic imbalance can lead to the accumulation of intermediate metabolites or gene products with potential cytotoxicity or, in some cases, may affect normal cell growth. In addition, the stress caused by the overexpression of enzymes (proteins), which can be insoluble, will induce the selection of low producers during fermentations. Thus, a significant challenge of using microbial cells as biofactories is to optimally balance the expression of a number of enzymes in a pathway, where multivariate optimization is necessary.

A number of tools are currently available to allow the fine modulation of gene expression in a pathway. These include methods for generating randomized genetic knockouts and overexpression libraries, synthetic promoter libraries, tunable intergenic regions, and global techniques (e.g., artificial transcription factor engineering, ribosome engineering, global transcription machinery engineering, and genome shuffling).

Promoters, both constitutive and inducible, have long been used to control gene expression. The genetic engineering of promoters of various strengths has produced large libraries, which have been used predominantly to precisely control the expression of a single gene or a small number of genes.

To differentially control a large number of genes, it is common to use multiple promoters with different strengths combined with various genetic carriers such as plasmids of different copy numbers. A distinct disadvantage of this approach is that there is a restricted number of such regulatory elements, so the ability to tune the expression of the gene of interest (GOI) is limited. Furthermore, when differential expression of multiple genes is required, the search for an optimal condition is often extensively time and resource consuming due to the permutation of the regulatory elements to be used. In addition, the multiple control elements use divergent mechanisms which are subject to different global cellular controls. Because of these constraints, it is difficult to predict the response of the system when engineered, thus reducing the chance of finding the optimal condition rapidly. Hence, simultaneous optimization of the expression of a number of genes in a pathway is still highly empirical, unpredictable and time consuming. Currently, there is no way of knowing whether an optimum has been achieved by tuning with the existing tools and methods, making these highly unsatisfactory. Hence, a tacit demand, yet to be met, is a systematic method that optimizes the expression of multiple gene cassettes in a predictable and well-controlled manner and so enables the identification of an optimal set of parameters in a multidimensional space.

All isoprenoids are synthesized from two building blocks (IPP and DMAPP) by various synthases of the DXP or the MVA pathway, and these can be heterologously expressed in E. coli (FIG. 1). To produce these two precursors through the MVA pathway in E. coli, the whole heterologous pathway (hmgS, hmgR, MVK, PMVK, MVD), together with a native upstream enzyme, atoB, must be overexpressed (FIG. 1, MVA pathway), while several rate-limiting steps have been identified for the native DXP pathway, including the committed step, dxs, and three intermediate enzymes (ispD, ispF, idi) (FIG. 1, DXP pathway). These rate-limiting steps for isoprenoid production through either pathway have been divided into two or three pathway modules whose expression levels were altered and optimized by varying their promoter types or recombinant plasmid copy numbers.

In this study, the overexpression of some genes in the DXP pathway (dxs, ispD, ispF, idi) and of the heterologous MVA pathway was used as the focused modules for the development of tools. A series of novel methods and tools for simultaneously tuning multiple pathway modules was systematically developed to optimize the production of isoprenoids (FIG. 2). A decomposition method was first explored to individually regulate the pathway modules with various available tunable promoters (FIG. 2A). These independent promoters (each controlled through different cellular resources) allowed each component in the module to be simultaneously altered by varying the concentrations of their inducers. The ease of continuously altering the expression of the GOI by modulating the promoter with exogenously added inducers makes this a convenient method. This decomposition approach was successfully used to optimize the production of lycopene, a C40 isoprenoid, via the DXP pathway by tuning the upstream pathway module (SIDF) together with the downstream module (crtEBI) (FIG. 1). A global optimum could consistently be observed, while excessive overexpression of the components in the modules resulted in inhibitory effects, revealing the importance of pathway balancing. As this method is purely trial-and-error, its utility is severely limited by the enormous number of conditions to be tested as the number of module permutations increases.

Instead of decomposition, another systematic method was developed by treating the expression of multiple modules as an integrated process. The optimal condition for productivity is found by modulating two orthogonal dimensions: the ratio between pathway modules and the overall expression level of each component (FIG. 2B). Based on the strong original T7 promoter, mutant promoters with different strengths were generated and used combinatorially to control various modules, producing different ratios of expressed genes. On the other dimension, the extrinsic transcriber, acting as a master regulator, alters the overall expression level by uniformly tuning all the promoters, independent of their strength, as a whole system. This univariant controlling method was initially demonstrated by optimizing the production of lycopene and then extended to engineer a three-module (S, ISF and ADS) synthetic pathway for the production of amorphadiene, a C15 isoprenoid (FIG. 1). According to the results, the system was mainly restricted by a single global boundary, making it possible to carry out a rational optimization with the univariant controlling method and minimizing the number of strains to construct. The method was further successfully applied to identify the optimum condition for amorphadiene synthesis through the MVA pathway by simultaneously tuning three pathway modules (SBR, KKDI and AA, FIG. 1). In addition, the properties and robustness of the invented tools were characterized at the transcription and translation levels. The robustness of the system was also characterized by showing that the two dimensions of control had no interaction and that the engineered promoters kept their relative strengths under various conditions.

Results

Gene Overexpression Reduces Lycopene Production

Lycopene (C40 isoprenoid), an effective antioxidant, was initially synthesized in E. coli with the overexpression of four bottleneck enzymes (dxs, idi, ispD, ispF) in the DXP pathway as well as three plant genes (crtE, crtB, crtI), separated into upstream (SIDF) and downstream (crtEBI) modules (FIG. 1). The excessive overexpression of enzymes can inhibit isoprenoid production. In order to investigate this issue, the SIDF module was expressed under control of the inducible T7 promoter (pET-T7-SIDF plasmid) together with a constitutively expressed crtEBI module (pAC-LYC plasmid) in the E. coli BL21-Gold DE3 strain. As predicted, the yield of lycopene increased initially but decreased at higher inductions (FIG. 3A). Initially, the hypothesis was that strong overexpression of upstream enzymes might interfere with the expression of downstream isoprenoid synthetic genes, but this was later found to be incorrect. Further studies were carried out to test whether the inhibition was caused by the function of the expressed enzymes or by the expression process itself. An enhanced green fluorescence protein (eGFP) without enzymatic activity and noncoding versions of the dxs and idi genes (t-dxs and t-idi), serving as translation and transcription controls, respectively, were overexpressed at various levels together with constitutive expression of crtEBI for lycopene production (FIG. 3B). The t-dxs and t-idi genes carried the dxs and idi sequences, respectively, but were modified by deleting the ribosome binding sites as well as the start codons (ATG), thus disabling translation into proteins in E. coli. Based on the results, it was likely that the overexpression process, mainly due to the synthesis of proteins, posed a global biochemical limitation that burdened the cells and inhibited isoprenoid production.

Optimization of Lycopene Production with Two Independent Tunable Promoters

In order to minimize the burden caused by overexpression, limited amounts of the bottleneck enzymes should be expressed. Hence, it was necessary to distribute the quota of resources to distinct pathway modules in a balanced manner to maximize the overall flux towards the product. Tunable promoters, where expression levels are conveniently and continuously modulated by the cognate transcribers, are highly desirable for rapid identification of the optimal condition (FIG. 4A). To demonstrate this, the SIDF and the crtEBI modules were driven by two distinct, independent tunable promoters: the IPTG-inducible T7 promoter and the arabinose-inducible pBAD promoter. A two-dimensional search was carried out by varying both inducers simultaneously (FIG. 4A). The expression of either module, where genes were expressed as a polycistron, was monitored by the transcription level of the first enzyme in each module (dxs for the SIDF module and crtE for the crtEBI module). As shown in the transcription results (FIG. 4D), both promoters can be independently and consecutively regulated. In the search space, a smooth lycopene response surface with only one optimum, at high arabinose (3.3 mM) and low IPTG (0.011 mM) inductions, was observed. Only a minimum induction was required for the SIDF module, indicating that the strong T7 promoter, which creates superfluous stress, may not be suitable for use here.

Design and Construct Mutant T7 Promoters

To alter the expression range, the T7 promoter was modified by site-directed mutagenesis to create a mutant library with varying promoter strengths. The rate-determining steps of transcription with T7 RNA polymerase are the binding of the polymerase to the specific T7 promoter sequence, followed by the melting of the double-stranded DNA and the initiation of transcription with small transcripts. These actions can be mapped to the different regions of the conserved promoter sequence in FIG. 5. In the system used in this study, IPTG functions by inducing T7 RNA polymerase synthesis and relieving the inhibition of the T7-based promoters by binding to the repressor from the lac operator. To maintain tunability, the native T7 promoter was selectively disabled at the melting/initiation region, as this is an inherent property of the promoter, unlike the binding process, which can be affected by parameters such as polymerase and DNA concentrations. Based on the strengths measured by in vitro transcription, selected mutant promoters covering various expression levels were constructed (Table 1) and their strengths were determined by quantifying eGFP protein expression. Strikingly, the strengths characterized in vivo were different from the published in vitro measurements (Table 1). According to the in vivo strengths, three T7 mutant promoters with low leaky expression, herein named TM1, TM2 and TM3, were chosen for further studies.

TABLE 1
The strength of mutant T7 promoters

No. | Name | Mutation | Leaky strength | Induced strength | Strength in literature
1 | T7 | N.A. | 6.4% | 100% | 100%
2 | | −2 to A, −3 to T | 3.6% | 104% | 42%
3 | TM1 | −1 to T, −2 to A | 2.3% | 92% | 75%
4 | | 2 to T, −1 to T | 1.5% | 85% | 65%
5 | | −1 to T, −2 to G | <1% | 74% | 58%
6 | | −2 to A, −4 to G | <1% | 42% | 41%
7 | TM2 | −2 to G, −3 to C | <1% | 37% | 56%
8 | | −3 to G, −2 to C | <1% | 26% | 46%
9 | TM3 | +2 to A, +1 to A | <1% | 16% | 13%
10 | | +2 to C, +1 to C | <1% | 12% | 33%
11 | | +2 to A, +1 to C | <1% | 10% | 19%
12 | | +1 to T, −1 to G | <1% | 6.1% | 24%

All the mutation positions were labeled according to the sequence in FIG. 4. The leaky and induced (0.3 mM IPTG) expression levels were measured by expression of eGFP under control of various promoters in the pET vector and normalized to the induced expression level of the native T7 promoter.

The differences between mutant promoters are solely defined by the rate of melting/initiation, which is a first-order reaction independent of other factors. As a result, all mutant promoters (TM1, TM2 and TM3) should have a similar response to IPTG induction. This was validated by expressing eGFP as the reporter in the pAC vector (FIG. 6A). According to the normalized expression levels, these mutant promoters responded equally to various doses of IPTG (FIG. 6B) and retained their relative strengths across inductions (FIG. 6C).
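The behaviour described here, a promoter-specific strength multiplied by an induction response that is common to all promoters, can be illustrated with a minimal Python sketch. The induced strengths are taken from Table 1; the saturating form of `induction_response` and its `half_max_mM` value are purely hypothetical assumptions chosen for illustration and are not the model used in this work.

```python
# Induced strengths relative to native T7, taken from Table 1.
INDUCED_STRENGTH = {"TM1": 0.92, "TM2": 0.37, "TM3": 0.16}


def induction_response(iptg_mM, half_max_mM=0.05):
    """Hypothetical saturating response to IPTG, shared by every promoter."""
    if iptg_mM < 0:
        raise ValueError("IPTG concentration cannot be negative")
    return iptg_mM / (half_max_mM + iptg_mM)


def expression(promoter, iptg_mM):
    """Expression = promoter-specific strength x shared induction term."""
    return INDUCED_STRENGTH[promoter] * induction_response(iptg_mM)


def strength_ratio(promoter_a, promoter_b, iptg_mM):
    """Ratio of two promoters' expression at one (non-zero) IPTG dose."""
    return expression(promoter_a, iptg_mM) / expression(promoter_b, iptg_mM)


if __name__ == "__main__":
    for dose in (0.011, 0.1, 0.3):  # mM; 0.1 is an illustrative intermediate dose
        print(dose, round(strength_ratio("TM1", "TM2", dose), 3))
```

Because the induction term cancels in the ratio, the relative strength of any two promoters is the same at every dose, whatever shape the shared response actually has.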

Optimization of Lycopene Production with Mutant Promoters

The T7 promoter for the SIDF module was replaced with two significantly weaker promoters, TM2 and TM3, to extend the search space. The transcriptional results (FIGS. 4E & 4F) showed that the mutant promoters have comparatively low expression in accordance with their strengths (FIG. 4G), and no interaction with the pBAD promoter was observed. The lycopene response surface shifted accordingly but retained a single optimum on a smooth surface (FIGS. 4B & 4C). The predicted optimum conditions identified with the different promoters (T7, TM2, TM3) were closely located (FIG. 4G), which showed that reaching a balance between pathway modules was a major task for pathway optimization. Evidently, the mutant promoters extended the coverage of the expression range, while the optimum yield for the T7 promoter (56.4 mg/L) was not as high as that of the TM2 (74.5 mg/L) or TM3 (80.0 mg/L) promoters. One possibility was that, to achieve the same expression level, the native promoter expressed the transcripts faster than the mutant ones, which may impose excessive cellular burdens. However, regardless of the strength of the promoters, the expression rates showed similar kinetics in response to induction (FIG. 7). Another issue related to the use of the native promoter is its high leaky expression, which may burden the cells even before induction. Consistent with this suggestion was the significant reduction in isoprenoid production through either the DXP or the MVA pathway in strains from which the Repressor plasmid (providing lacI protein that suppresses expression before IPTG induction) had been removed so as to enable constitutive activation of the native promoter (FIG. 8).

The library, which provided promoters with a variety of tunable ranges, was then used in conjunction with another independent promoter to optimize metabolite production in a multivariate manner.

Development of a Univariant Controlling Approach

Because of the limited types and tunable ranges of the independent promoters that are naturally available, a combinatorial multivariant-modular controlling approach is impractical with more than two modules, as the number of experimental conditions also increases exponentially. In an attempt to develop a simplified, robust and rational engineering approach, the optimization challenge was dissected into two distinct parts: balancing the various pathway modules and reducing overexpression burdens. In order to maximize the flux efficiency and avoid toxic or inhibitory intermediates, a balanced pathway is always critical, independent of the overall flux. On the other hand, the overall expression needs to be optimized to balance flux and burden, a general limitation caused by high expression regardless of the function of the module. To address these two distinct yet related challenges (flux and burden), selected promoters from the T7 promoter library were used to alter the relative ratios between various pathway modules by their strength. At the same time, the concentration of IPTG, serving as a global factor, was used to regulate the expression of all the modules simultaneously while maintaining the ratios of promoter strengths (FIG. 2B). By tuning these two orthogonal dimensions, this univariant controlling approach was able to overcome the limitations of combinatorial multivariant controlling.

In order to test the hypothesis that the relative strengths, or ratios of the strengths, of these mutant promoters are indeed evenly distributed when they compete for a limited pool of resources, an in vitro transcription system was established to mimic the circumstances encountered in vivo. The modules were standardized by expressing the eGFP gene with short sequence tags which could be differentiated by specific qPCR primers (FIG. 9A). All combinations of two or three tagged modules were mixed in equal amounts in the reactions, and the results showed that the modules with the same promoter but different tags behaved similarly, indicating that these sequences were expressed equally (P1-TM1/P2-TM1, P1-TM2/P2-TM2, P1-TM3/P2-TM3) (FIGS. 9B & 9D). Next, the expression level of the gene from a weaker promoter (e.g., P2-TM2 in FIG. 9B) was, as expected, lower in the presence of a strong promoter (P2-TM2/P1-TM1 in FIG. 9B) than when co-transcribed with a comparable promoter (P2-TM2/P1-TM2 in FIG. 9B), demonstrating that competition occurred under the reaction conditions (high template concentration relative to T7 polymerase availability) (FIGS. 9B & 9D). Re-plotting the data (FIGS. 9C & 9E) made it clear that the relative strengths of the mutant promoters were fixed, even under competitive, resource-limiting conditions. With such constant ratios, which depend only on the mutant promoters, the modules will always have the same occupancies of the transcription resource regardless of the experimental conditions. On the other, independent orthogonal dimension, IPTG should still regulate the overall resource independently of other parameters.
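The competition behaviour described above can be illustrated with a toy model. The sketch below assumes, purely for illustration, that a fixed pool of transcription resource is split among co-transcribed templates in proportion to promoter strength; this proportional-sharing rule and the function name `shared_transcription` are hypothetical and are not taken from the experiments. The strength values are the fitted in vivo values of Equation 1, reused here as an assumption.

```python
# Toy model (an assumption, not the measured system): a fixed pool of
# transcription resource is shared in proportion to promoter strength.
PROMOTER_STRENGTH = {"TM1": 100.0, "TM2": 41.9, "TM3": 7.9}  # values from Equation 1


def shared_transcription(promoters, resource=1.0):
    """Return the output of each template when `resource` is split
    proportionally to the strengths of the co-transcribed promoters."""
    total = sum(PROMOTER_STRENGTH[p] for p in promoters)
    return [resource * PROMOTER_STRENGTH[p] / total for p in promoters]


# A TM2 module yields less next to TM1 than next to another TM2 module ...
tm2_with_tm1 = shared_transcription(["TM2", "TM1"])[0]
tm2_with_tm2 = shared_transcription(["TM2", "TM2"])[0]

# ... but the TM2:TM1 ratio does not depend on how much resource is available.
ratios = []
for resource in (0.1, 1.0, 10.0):
    tm2, tm1 = shared_transcription(["TM2", "TM1"], resource=resource)
    ratios.append(tm2 / tm1)
```

Under this assumption the absolute output of a module depends on its neighbours, whereas the ratio between two modules is set only by their promoters, which is the qualitative pattern reported for FIGS. 9B to 9E.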

Optimization of Lycopene Production with the Univariant Controlling Approach

To demonstrate this further, three promoters with varying strengths (TM1, TM2 and TM3, Table 2) were selected to control the expression of the SIDF and crtEBI modules in a combinatorial way. First, the transcription levels of the modules were measured so as to examine the behavior of the co-existing promoters in vivo. The inducer IPTG (0.3-0.011 mM) was added to cells carrying various combinations of the mutant promoters (e.g., pETK-TM-SIDF TM1 with pAC-TM-crtEBI TM1) in pAC and pET vectors (FIG. 10A). All mutant promoters expressing SIDF (pET vector) responded to IPTG similarly, regardless of their strengths and the type of co-expressed promoter. When the strongest promoter (TM1) was expressed from a high copy number plasmid (pET, ~100 copies), the expression of the genes in the other plasmid (crtEBI in pAC, ~30 copies) was found to be lower than expected when compared to the other combinations (FIG. 10B), indicating a limitation of the transcription resource. Importantly, even in this situation, the relative strengths of the mutant promoters remained constant (first 3 sets of TM combinations). A common (univariant) resource is distributed at fixed ratios over mutant promoters with pre-set transcriptional strengths. Hence, any change in the global supply of the resource will influence the transcription from each mutant promoter in a pre-set manner.

Next, the lycopene response using the univariant controlling approach was investigated. As expected, for any of the strains with various ratios between the two modules, a compromise IPTG concentration that maximized lycopene by balancing burden and flux could always be identified (FIG. 11A). On the other dimension, with optimal IPTG induction, the lycopene production response differed between promoter pairs (FIG. 11C). In general, the crtEBI module required a stronger promoter (TM1 or TM2) than the SIDF module (TM3), and the yield was extremely low in the opposite situations. These observations demonstrated the importance of both dimensions for optimal tuning. Putting the two dimensions together, the tested conditions were well dispersed over the whole search space (FIG. 11B), which was wider than that obtained using the pBAD promoter (FIGS. 4D-4F) for the crtEBI module. Consequently, a slightly higher lycopene yield (102 mg/L) was achieved. Again, a global optimum was identified, and conditions located adjacent to it had higher production than distant ones. The existence of a single global boundary allowed a rational optimization approach that zoomed stepwise into the optimum conditions, which would accelerate the optimization process by reducing the number of strains to be constructed (FIG. 12).

Simultaneous Optimization of Three Pathway Modules with the Univariant Approach

In previous studies, four bottleneck steps scattered throughout the DXP pathway (FIG. 1) were grouped into a single module. It is highly possible that the optimal expression level for the committed step, dxs, is different from that of the rest of the intermediate steps (idi-ispD-ispF, IDF). The fact that IDF are the three enzymes in the DXP pathway found to be highly soluble upon overexpression, as compared to the rest of the enzymes in the pathway, suggests that lower expression of them may be required. To investigate this issue, the upstream pathway was divided into dxs and IDF modules, and another important isoprenoid, amorphadiene (the precursor of the antimalarial drug artemisinin), was synthesized with the DXP pathway by changing the crtEBI module to the ADS gene encoding amorphadiene synthase. To eliminate possible biases caused by variation in plasmid copy numbers, a library (27 recombinant plasmids) harboring the full combination of three promoters (TM1, TM2, TM3) with the three modules was constructed in a single pAC vector with the CLIVA method, and each of these plasmids was transformed into the BL21-Gold DE3 strain along with a pRepressor plasmid critical for the function of IPTG.
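The size of this library follows from assigning one of three promoters to each of three modules (3 × 3 × 3 = 27). As a minimal sketch, the combinations can be enumerated as below; the naming pattern follows constructs named in this description (e.g., pAC-TM2-dxs-TM3-IDF-TM2-ADS), while the helper `construct_name` is hypothetical, and applying the four IPTG levels of Equation 1 to every construct is an assumption made only to illustrate the size of the search space.

```python
from itertools import product

PROMOTERS = ("TM1", "TM2", "TM3")
MODULES = ("dxs", "IDF", "ADS")        # order of the modules in the pAC vector
IPTG_MM = (0.3, 0.1, 0.033, 0.011)     # levels listed in Equation 1 (assumed here)


def construct_name(promoters, modules=MODULES, vector="pAC"):
    """Build a plasmid name such as pAC-TM2-dxs-TM3-IDF-TM2-ADS."""
    parts = [vector]
    for promoter, module in zip(promoters, modules):
        parts += [promoter, module]
    return "-".join(parts)


# One promoter per module: 3**3 = 27 plasmids.
constructs = [construct_name(combo)
              for combo in product(PROMOTERS, repeat=len(MODULES))]

# Each plasmid tested at each inducer level: 27 * 4 = 108 conditions.
conditions = [(name, iptg) for name in constructs for iptg in IPTG_MM]
```

Only the 27 plasmids have to be built; the IPTG dimension multiplies the number of culture conditions but not the number of strains.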

Tuning of IPTG, as expected, allowed the identification of the optimal overall expression for each engineered strain (FIG. 13A). To extend the findings, the system was transferred to another routinely used E. coli strain, MG1655 DE3 (K12 strain family), which differs from the former B strain derivative BL21-Gold DE3 (FIG. 13B). As expected, IPTG performed well in balancing the burden and overall expression for each condition, while notable differences could be observed between the two strains. For the MG1655 DE3 strain, the maximum yield was attained at two conditions: pAC-TM2-dxs-TM3-IDF-TM1-ADS with 0.1 mM IPTG (232 mg/L) and pAC-TM3-dxs-TM3-IDF-TM2-ADS with 0.3 mM IPTG (232 mg/L), indicating that, for optimum production, more ADS than dxs is required while the expression of IDF should be kept as low as possible, which validated our previous hypothesis. On the other hand, the best conditions for the BL21-Gold DE3 strain, pAC-TM2-dxs-TM3-IDF-TM2-ADS with 0.3 mM IPTG (281 mg/L) and pAC-TM2-dxs-TM2-IDF-TM2-ADS with 0.1 mM IPTG (274 mg/L), showed equal expression for the dxs and ADS modules.

The measurement of expression levels of the selected strains revealed that, with a low copy pAC vector, competition for the transcriptional resource did not appear to occur (FIG. 14). On that basis, the relative expression level (a.u.) of modules controlled by the univariant controlling method can be calculated by “Equation 1”, where the parameters were fitted by the least squares method from the data of “FIG. 6”. The amorphadiene production corresponding to the relative expression of the 3 modules was then represented in a 3-D graph (FIGS. 15A & 15C). According to the plot, the univariant controlling method systematically covered a large space within which neither too low nor too high expression was propitious to production. The deduced high production conditions (more than half of the maximum) for both strains (FIGS. 15B & 15D) were located in a fairly focused space, raising the possibility of the existence of a single optimum.

Calculation of relative expression levels in arbitrary units (a.u.)

$$\text{Relative expression (a.u.)} = \text{IPTG induction strength} \times \text{Mutant promoter strength} \qquad \text{(Equation 1)}$$

$$\text{IPTG induction strength} = \begin{cases} 100, & 0.3\ \text{mM IPTG} \\ 61.4, & 0.1\ \text{mM IPTG} \\ 33.6, & 0.033\ \text{mM IPTG} \\ 10.5, & 0.011\ \text{mM IPTG} \end{cases}$$

$$\text{Mutant promoter strength} = \begin{cases} 100, & \text{TM1 promoter} \\ 41.9, & \text{TM2 promoter} \\ 7.9, & \text{TM3 promoter} \end{cases}$$

The pathway modules' expression levels were calculated as the product of the mutant promoter strengths and the IPTG induction strengths. Based on the expression level of eGFP under the control of the mutant promoters and IPTG induction (FIG. 6), the values of both strengths were estimated with least-squares linear optimization. The maximum levels were arbitrarily assigned as one hundred. As IPTG induction has similar effects on different promoters, multiplication of the two values was used to obtain the relative expression levels.
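As a minimal sketch, Equation 1 can be evaluated as a table lookup followed by a multiplication. The strength values are those given with Equation 1; the function names `relative_expression` and `module_expression` are hypothetical and introduced only for illustration.

```python
# Fitted values given with Equation 1 (arbitrary units, maximum set to 100).
IPTG_INDUCTION_STRENGTH = {0.3: 100.0, 0.1: 61.4, 0.033: 33.6, 0.011: 10.5}  # key: mM IPTG
PROMOTER_STRENGTH = {"TM1": 100.0, "TM2": 41.9, "TM3": 7.9}


def relative_expression(promoter, iptg_mm):
    """Equation 1: IPTG induction strength x mutant promoter strength (a.u.)."""
    return IPTG_INDUCTION_STRENGTH[iptg_mm] * PROMOTER_STRENGTH[promoter]


def module_expression(promoter_by_module, iptg_mm):
    """Relative expression of every module of one construct at one IPTG level."""
    return {module: relative_expression(promoter, iptg_mm)
            for module, promoter in promoter_by_module.items()}


# Example: pAC-TM2-dxs-TM3-IDF-TM2-ADS induced with 0.3 mM IPTG.
example = module_expression({"dxs": "TM2", "IDF": "TM3", "ADS": "TM2"}, 0.3)
```

Because the IPTG term is common to all modules of a construct, changing the inducer concentration rescales every module by the same factor and leaves the ratios between modules unchanged.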

In an attempt to investigate the global trend along the dimension of ratios, the percentages of the modules in each construct were calculated according to the strengths of the mutant promoters, and ternary plots were employed to illustrate the results. In the plot, each vertex of the equilateral triangle represents a pathway module, and the percentage of a specific module decreases linearly with increasing distance from its corner (FIG. 17A), while the color of the dots represents the optimized yield obtained along the IPTG dimension (FIGS. 16A & 16B). The plots showed that a global optimum existed in both strains but was slightly shifted towards more ADS in the MG1655 DE3 strain (FIG. 16B) when compared to the BL21-Gold DE3 strain (FIG. 16A). On the whole, the conditions surrounding the global optimum have a higher yield than those far away. This kind of general reverse correlation indicated that the system had hit a global boundary, possibly due to the metabolic burden caused by high expression. On the other hand, the presence of several local optima, especially in the BL21-Gold strain, suggested that minor local boundaries were encountered. The global boundary, leading to a continuous change in the yield throughout the search space, allowed the rapid optimization of the pathway through a rational approach (FIGS. 17B-17D).
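The ternary coordinates described above can be sketched as follows. It is assumed here that the percentage of a module is its promoter strength divided by the sum of the three promoter strengths of the construct; the text states only that the percentages were calculated from the promoter strengths, so this normalization and the function name `module_percentages` are assumptions.

```python
PROMOTER_STRENGTH = {"TM1": 100.0, "TM2": 41.9, "TM3": 7.9}  # values from Equation 1


def module_percentages(promoters):
    """Share (%) of each module, assuming share = strength / sum of strengths."""
    strengths = [PROMOTER_STRENGTH[p] for p in promoters]
    total = sum(strengths)
    return tuple(100.0 * s / total for s in strengths)


# pAC-TM2-dxs-TM3-IDF-TM2-ADS: promoters listed in the order (dxs, IDF, ADS).
dxs_pct, idf_pct, ads_pct = module_percentages(("TM2", "TM3", "TM2"))
# dxs and ADS each receive about 45.7 %, IDF about 8.6 %.
```

Under this normalization, constructs that use the same promoter for all three modules fall on the same central point of the triangle and differ only in overall expression, which is the quantity tuned along the IPTG dimension.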

Applying the Univariant Controlling Approach for MVA Pathway Optimization

Next, the same approach was utilized to optimize the MVA pathway for amorphadiene production. The pathway was separated into three modules, SBR (hmgS-aroB-hmgR), KKDI (MVK-PMVK-MVD-idi) and AA (ADS-ispA), according to the order of flux (FIG. 1). The overexpression of ispA was necessary when using the MVA pathway but not the DXP pathway, which provides less upstream flux, so that the endogenously expressed ispA is sufficient to carry it (data not shown). By tuning the overall expression, an optimum could be identified for each strain as usual (FIG. 13C). Notably, the MVA pathway was more sensitive to the tuning of promoters (FIG. 13C) compared to the DXP pathway (FIGS. 16A & 16B). The clear optimum (pAC-TM3-SBR-TM2-KKDI-TM3-AA) and two suboptima (pAC-TM2-SBR-TM1-KKDI-TM3-AA and pAC-TM3-SBR-TM1-KKDI-TM3-AA) revealed that a higher expression of the KKDI module than of the SBR module, as well as a minimum expression of the AA module, was critical for high production of amorphadiene. The same conclusion could be drawn from the ternary plot (FIG. 16C), where the global optimum was located in a small corner area. The yields in this focused optimum region were distinctly higher than those of the rest of the conditions (the color representation for amorphadiene yield is on an exponential scale).

Extracellular Accumulation of DXP Pathway Intermediates During Optimization

The efflux of DXP pathway intermediates when the pathway was overexpressed has been discussed. To further investigate the optimization process, the extracellularly accumulated metabolites of the DXP pathway were measured for the BL21-Gold DE3 strain under the conditions used for amorphadiene production optimization, and DXP (1-deoxy-D-xylulose 5-phosphate, product of dxs) and MEC (2-C-methyl-D-erythritol 2,4-cyclodiphosphate, product of ispF) were found to accumulate significantly (FIG. 18A). Reasonably, more DXP accumulated when a stronger promoter (TM1) was applied to the dxs module (FIG. 19B, TM-dxs=1). Examining the responses of the various modules to IPTG induction carefully, at conditions optimum for amorphadiene production, significantly smaller amounts of DXP accumulated in the medium (FIGS. 19A & 19B, compare amorphadiene and DXP). By tuning the overall expression level, an optimum could be identified with minimized leakage of pathway intermediates. Under the rest of the conditions (FIG. 19B, TM-dxs=2, 3), DXP accumulated occasionally without a clear trend, indicating the complexity of the process.

MEC accumulated to a higher level than DXP. Surprisingly, a similar response of MEC and amorphadiene could be found (FIGS. 19A & 19B, compare amorphadiene and DXP). As a secondary product of the pathway due to the limitation of downstream enzymes, the yield of MEC was well correlated with that of amorphadiene at all conditions (FIG. 18B), and the ternary plot differed only slightly (FIG. 18C). This meant that a certain global parameter, or the upstream part of the pathway (up to ispF), was the major limiting factor that was optimized for both products. However, the local area of particularly high accumulation (pAC-TM3-dxs-TM2-IDF-TM2-ADS) observed in the ternary plot of MEC (FIG. 18C) did not exhibit low expression of the ADS module, which had no direct relationship with MEC accumulation, indicating that pathway optimization was an interrelated process.

Discussion

To engineer a biological process, tuning the expression of the related genes is the most common and useful method to increase productivity. For pathway optimization, combinations of different promoters (e.g., lac, T7, T5, BAD, etc.) and recombinant gene carriers (various plasmids or the genome) are widely used in current practice, which is highly unsatisfactory due to the lack of predictability. As a result, most studies have only managed to vary and optimize one parameter at a time, and those bottom-up approaches provide no insight into the global status of the systems. To address this with a top-down approach, decomposition of the whole pathway into two modules, whose expression was separately controlled by well characterized independent tunable promoters, was initially carried out. The ease of control of the individual inducers allowed the simultaneous and continuous alteration of the expression of both modules and revealed a global optimum within the expression range. However, this multivariate strategy is not ideal, as most of the naturally tunable promoters in microbes use sugars as regulators. The sugar inducers may complicate the system because they are limited by the transport system and may affect cellular metabolism, while any perturbation of the global system will have distinct effects on each promoter, raising the difficulty of using several of them simultaneously. Together with the limitation of their dynamic range in tuning and the irrational nature of this multivariate approach, it can be impractical for manipulating multiple modules.

Rather than treating each module separately, another, rational univariant controlling method was then developed by decomposing the regulatory process into two orthogonal dimensions: the overall expression level and the ratios between modules. The modulation of the two dimensions was realized using a dependent tunable promoter library whose members share the same transcription resource, T7 RNA polymerase, and a common mechanism of action, so that the former dimension could be conveniently tuned by varying the availability of the inducer, IPTG. At the same time, mutations were specifically introduced into the melting/initiation region of the promoter, making their relative strengths constant, so that the ratio of modules was solely defined by the cognate promoters. The independence of the two dimensions was validated under conditions in which the promoters were used separately or together. With a wide dynamic range on both dimensions, the method comprehensively and continuously covers a broad space, allowing a systematic search for the optimum condition of three pathway modules. In addition, a rational approach can be applied to accelerate the optimization process, especially with complicated multiple module systems.

Because pathway optimization involves kinetic events and is confounded by multiple feedback controls and global factors, little is known about its mechanism. The production of pathway enzymes has now been shown to act as a burden on the cell, possibly due to the synthesis of unnecessary proteins or the formation of inclusion bodies when they are profusely produced inside the cell. As a result, an optimum overall expression level could not be consistently predicted by tuning the IPTG concentration. Examining the other dimension of tuning in a ternary plot, a clear global optimum existed in all tested systems, indicating the existence of major bottlenecks, which were presumably different for the various systems, as the MVA pathway was found to be much more sensitive to tuning than the DXP pathway. The information gained can serve to guide the identification of novel bottlenecks. Further optimization of the system will no longer involve tuning the expression of the genes but other factors, e.g., strains, growth medium, etc. This is important because, by knowing the limits, other potential directions can be explored with confidence. For example, when studies were carried out initially with the BL21-Gold (DE3) strain and later with the MG1655 DE3 strain, different locations of the global optimum were identified in the ternary plot, while the optimal values were comparable.

When optimizing the DXP pathway for amorphadiene production, the amounts of intermediates released extracellularly responded distinctly to pathway tuning: MEC had a profile similar to that of amorphadiene, while DXP was inversely correlated when the dxs module was highly expressed. The obvious kinetic difference between these may be due to DXP, but not MEC, being re-consumed by the cell, which further increases the complexity of the optimization task. Despite all these confounding mechanisms, the univariant control method described herein provides a systematic, rational and robust tool for the modulation of multiple genes for metabolic pathway optimization.

Conclusion

A univariant control method was established for the multivariate engineering of pathway modules by tuning two dimensions: the ratios between the modules and the overall expression defined with biological principles. The tuning of the ratios balanced the activity of pathway enzymes so as to minimize the accumulation of unwanted intermediates. While the overall expression level is related to metabolic flux and metabolic burden, the fine tuning balanced these two competing parameters. A well characterized and designed T7 promoter library was established which enabled the orthogonal regulation at these two dimensions.

Compared to other, less systematic methods which attempt to modulate different pathway modules separately, the method described in this paper allowed a broad gene expression space to be searched with minimal effort. Moreover, the optimized systems were more tolerant to global and environmental changes.

Applying these tools, combinatorial engineering of the DXP or MVA pathway for isoprenoid production was carried out. Global optima were identified, and at these conditions large enhancements of the yields (>40 fold for the DXP pathway and >1000 fold for the MVA pathway) were observed.

Methods

Bacterial Strains and Plasmid Construction

All the plasmids used in this study are summarized in “Table 2”. The original vector pBAD-B was purchased from Invitrogen and pET-11a was purchased from Stratagene. The RK2A vector (pJB864) (Blatny, J. M., et al., “Improved broad-host-range RK2 vectors useful for high and low regulated gene expression levels in gram-negative bacteria,” Plasmid, 38(1): 35-51 (1997)) was acquired from the National BioResource Project (NBRP). All the E. coli genes were cloned from cDNA of the E. coli MG1655 strain from ATCC, and amorphadiene synthase was codon optimized and synthesized by Genscript. The CLIVA method was used to generate mutant promoters and to combine multiple modules for amorphadiene production into one (pAC) vector. E. coli XL10-Gold strain (Invitrogen) or DH10B strain (NEB) was used for plasmid construction. E. coli K-12 MG1655 DE3 was from Ajikumar, P. K., et al., “Isoprenoid pathway optimization for Taxol precursor overproduction in Escherichia coli,” Science, 330(6000): 70-74 (2010), and E. coli BL21-Gold (DE3) strain was from Stratagene. Both strains, which carry T7 RNA polymerase, were used for isoprenoid production.

TABLE 2 Plasmids used in this study

Part I: Plasmid composition

| Name | Vector | Promoter | Genes |
|---|---|---|---|
| pETK | pETK | T7 | none |
| pETK-T7-SIDF | pETK | T7 | dxs-idi-ispD-ispF |
| pAC-LYC | pAC | Constitutive | crtE-crtB-crtI |
| pBAD-crtEBI | pBAD | pBAD | crtE-crtB-crtI |
| pAC-BAD-crtEBI | pAC | pBAD | crtE-crtB-crtI |
| pAC-T7-crtEBI | pAC | T7 | crtE-crtB-crtI |
| pAC-T7-ADS | pAC | T7 | ADS |
| pAC-T7-AA | pAC | T7 | ADS-ispA |
| pETK-T7-eGFP | pETK | T7 | eGFP |
| pETK-T7-dxs | pETK | T7 | dxs |
| pETK-T7-idi | pETK | T7 | idi |
| pETK-T7-IDF | pETK | T7 | idi-ispD-ispF |
| pETK-T7-t-dxs | pETK | T7 | N.dxs |
| pETK-T7-t-idi | pETK | T7 | N.idi |
| pETK-TM1/2/3-SIDF | pETK | TM1/TM2/TM3 | dxs-idi-ispD-ispF |
| pAC-TM1/2/3-crtEBI | pAC | TM1/TM2/TM3 | crtE-crtB-crtI |
| pETK-TM1/2/3-dxs | pETK | TM1/TM2/TM3 | dxs |
| pAC-TM1/2/3-ADS | pAC | TM1/TM2/TM3 | ADS |
| pAC-TM1/2/3-AA | pAC | TM1/TM2/TM3 | ADS-ispA |
| pRepressor | pETK | Constitutive | lacI |
| RK2A-T7-IDF | RK2A | T7 | idi-ispD-ispF |
| RK2A-TM1/2/3-IDF | RK2A | TM1/TM2/TM3 | idi-ispD-ispF |
| pAC-TM1/2/3-dxs-TM1/2/3-IDF-TM1/2/3-ADS | pAC | TM1/TM2/TM3; TM1/TM2/TM3; TM1/TM2/TM3 | dxs; idi-ispD-ispF; ADS |
| pETK-T7-SBR | pETK | T7 | hmgS-aroB-hmgR |
| pETK-TM1/2/3-SBR | pETK | TM1/TM2/TM3 | hmgS-aroB-hmgR |
| RK2A-T7-KKDI | RK2A | T7 | MVK-PMVK-MVD-idi |
| RK2A-TM1/2/3-KKDI | RK2A | TM1/TM2/TM3 | MVK-PMVK-MVD-idi |
| pAC-TM1/2/3-SBR-TM1/2/3-KKDI-TM1/2/3-AA | pAC | TM1/TM2/TM3; TM1/TM2/TM3; TM1/TM2/TM3 | hmgS-aroB-hmgR; MVK-PMVK-MVD-idi; ADS-ispA |
| pETK-T7-eGFP-tag1 | pETK | T7 | eGFP-tag1 |
| pETK-T7-eGFP-tag2 | pETK | T7 | eGFP-tag2 |
| pETK-T7-eGFP-tag3 | pETK | T7 | eGFP-tag3 |
| pETK-TM1/2/3-eGFP-tag1 | pETK | TM1/TM2/TM3 | eGFP-tag1 |
| pETK-TM1/2/3-eGFP-tag2 | pETK | TM1/TM2/TM3 | eGFP-tag2 |
| pETK-TM1/2/3-eGFP-tag3 | pETK | TM1/TM2/TM3 | eGFP-tag3 |

Part II: Plasmid construction

| Name | Construction |
|---|---|
| pETK | Replace the ampicillin resistance gene of pET-11a with kanamycin resistance gene by ligation |
| pETK-T7-SIDF | Inserted into pETK one by one by ligation |
| pAC-LYC | From paper [13] |
| pBAD-crtEBI | Amplified from pAC-Lyc and inserted into pBAD-B one by one by ligation |
| pAC-BAD-crtEBI | Replace the vector of pBAD-crtEBI with pAC vector by CLIVA method |
| pAC-T7-crtEBI | Replace the promoter of pAC-BAD-crtEBI with T7 promoter by ligation |
| pAC-T7-ADS | Replace the gene of pAC-T7-crtEBI with ADS by ligation |
| pAC-T7-AA | Replace the gene of pAC-T7-crtEBI with ADS and ispA by ligation |
| pETK-T7-eGFP | Amplified from pIRES-eGFP and inserted into pETK by ligation |
| pETK-T7-dxs | Inserted into pETK by ligation |
| pETK-T7-idi | Inserted into pETK by ligation |
| pETK-T7-IDF | Inserted into pETK one by one by ligation |
| pETK-T7-t-dxs | Remove the RBS and start codon of pETK-T7-dxs |
| pETK-T7-t-idi | Remove the RBS and start codon of pETK-T7-idi |
| pETK-TM1/2/3-SIDF | Modify the promoter of pETK-T7-SIDF by CLIVA method |
| pAC-TM1/2/3-crtEBI | Modify the promoter of pAC-T7-crtEBI by CLIVA method |
| pETK-TM1/2/3-dxs | Modify the promoter of pETK-T7-dxs by CLIVA method |
| pAC-TM1/2/3-ADS | Modify the promoter of pAC-T7-ADS by CLIVA method |
| pAC-TM1/2/3-AA | Modify the promoter of pAC-T7-AA by CLIVA method |
| pRepressor | Remove the T7 promoter, RBS and T7 terminator of pETK |
| RK2A-T7-IDF | Replace the vector of pETK-T7-IDF with RK2A vector by CLIVA method |
| RK2A-TM1/2/3-IDF | Modify the promoter of RK2A-T7-IDF by CLIVA method |
| pAC-TM1/2/3-dxs-TM1/2/3-IDF-TM1/2/3-ADS | Combine the modules amplified from pETK-TM1/2/3-dxs, pAC-TM1/2/3-ADS and RK2A-TM1/2/3-IDF into pAC vector by CLIVA method |
| pETK-T7-SBR | The yeast genes (Saccharomyces cerevisiae) were inserted into pETK one by one by ligation |
| pETK-TM1/2/3-SBR | Modify the promoter of pETK-T7-SBR by CLIVA method |
| RK2A-T7-KKDI | The yeast genes (Saccharomyces cerevisiae) were inserted into RK2A-T7 one by one by ligation |
| RK2A-TM1/2/3-KKDI | Modify the promoter of RK2A-T7-KKDI by CLIVA method |
| pAC-TM1/2/3-SBR-TM1/2/3-KKDI-TM1/2/3-AA | Combine the modules amplified from pETK-TM1/2/3-SBR, pAC-TM1/2/3-AA and RK2A-TM1/2/3-KKDI into pAC vector by CLIVA method |
| pETK-T7-eGFP-tag1 | Inserted into pETK by ligation. Tag1 was amplified from crtE. |
| pETK-T7-eGFP-tag2 | Inserted into pETK by ligation. Tag2 was amplified from crtE. |
| pETK-T7-eGFP-tag3 | Inserted into pETK by ligation. Tag3 was amplified from crtE. |
| pETK-TM1/2/3-eGFP-tag1 | Modify the promoter of pETK-T7-eGFP-tag1 by CLIVA method |
| pETK-TM1/2/3-eGFP-tag2 | Modify the promoter of pETK-T7-eGFP-tag2 by CLIVA method |
| pETK-TM1/2/3-eGFP-tag3 | Modify the promoter of pETK-T7-eGFP-tag3 by CLIVA method |

Culture Medium and Growth Conditions

2×PY medium was prepared as follows: peptone 20 g/L, yeast extract 10 g/L and NaCl 10 g/L, adjusted to pH 7.0 and autoclaved at 121° C. for 20 min. An additional 10 g/L glycerol (for the DXP pathway) or glucose (for the MVA pathway), 50 mM HEPES buffer (pH 7.4) and 0.5% Tween 80 were added to the 2×PY medium for isoprenoid production. The following antibiotics were added to maintain selection: ampicillin (100 mg/L), chloramphenicol (34 mg/L) and kanamycin (50 mg/L). 1% (v/v) of overnight grown cell culture was inoculated, and cells were grown at 28° C. with shaking at 300 RPM for isoprenoid production. The inducers (L-arabinose or IPTG) were added when the optical density of the cells at 600 nm reached the range of 0.6-0.8. For lycopene production, 1 mL of cells was grown for 48 hours in a 14 mL BD Falcon™ tube. For amorphadiene, 0.8 mL of cells together with 0.2 mL of dodecane were grown for 72 hours in a 14 mL BD Falcon™ tube (Newman, J. D., et al., “High-level production of amorpha-4,11-diene in a two-phase partitioning bioreactor of metabolically engineered Escherichia coli,” Biotechnol. Bioeng., 95(4): 684-91 (2006)).

Lycopene and Amorphadiene Assay

Intracellular lycopene was extracted from 20-100 μL of bacterial culture (depending on the lycopene content of the cells). The cell pellet was washed for about 30-40 min and completely resuspended in 100 μL of double-distilled H₂O. 20 μL of the suspension was then extracted in 180 μL of acetone at room temperature for about 15 min with continuous vortexing and centrifuged at 2,800 g for 3 min. The lycopene content in the supernatant was quantified through absorbance at 472 nm with a microplate reader (Spectra Max 190, Molecular Devices), and concentrations were calculated from a standard curve. Amorphadiene was quantified by gas chromatography/mass spectrometry (GC/MS) by scanning for the 189 and 204 m/z ions, using trans-caryophyllene as the internal control and in vitro synthesized amorphadiene for the standard curve.
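The conversion from absorbance to culture titer can be sketched as below. The sketch assumes a linear standard curve, additive volumes and complete extraction; the standard concentrations and absorbance readings are hypothetical illustration values, not measured data, and the function names are introduced only for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lycopene_mg_per_l(a472, slope, intercept, culture_ul,
                      resuspend_ul=100.0, suspension_ul=20.0, acetone_ul=180.0):
    """Back-calculate the lycopene titer of the culture from A472 of the extract."""
    extract_mg_per_l = (a472 - intercept) / slope
    # 20 uL suspension in 180 uL acetone is a 10-fold dilution.
    extraction_dilution = (suspension_ul + acetone_ul) / suspension_ul
    # Pellet from `culture_ul` of culture was resuspended in 100 uL of water.
    resuspension_factor = resuspend_ul / culture_ul
    return extract_mg_per_l * extraction_dilution * resuspension_factor


# Hypothetical standard curve: lycopene standards (mg/L) vs. absorbance at 472 nm.
standards_mg_per_l = [0.0, 1.0, 2.0, 4.0, 8.0]
absorbance_472 = [0.00, 0.05, 0.10, 0.20, 0.40]
slope, intercept = fit_line(standards_mg_per_l, absorbance_472)

# A reading of 0.25 from the pellet of 50 uL of culture.
titer = lycopene_mg_per_l(0.25, slope, intercept, culture_ul=50.0)
```

With these illustrative numbers the extract contains 5 mg/L, which corresponds to 100 mg/L in the original culture after correcting for the 10-fold acetone dilution and the 2-fold resuspension volume.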

RNA Purification and cDNA Synthesis

Total RNA from E. coli was prepared using TRIzol® reagent (Invitrogen) according to the manufacturer's instructions. Total RNA was collected from samples in quadruplicate at each treatment time point. RNA concentration was quantified using a NanoDrop ND-1000 spectrophotometer (Thermo Scientific), and the 260/280 and 260/230 ratios were examined for protein and solvent contamination. The integrity of all RNA samples was confirmed on a formaldehyde agarose gel. 200 ng of total RNA was treated with RQ1 RNase-free DNase (Promega) and reverse transcribed in a total volume of 10 μL containing ImProm-II (Promega) for 60 min at 42° C. according to the manufacturer's instructions. The reaction was terminated by heating at 70° C. for 10 min.

Reverse Transcription and Quantitative PCR (RT-qPCR)

The cDNA levels were then analyzed using a BioRad iCycler 4 Real-Time PCR Detection System (Bio-Rad) with SYBR Green I detection. Each sample was measured in duplicate in a 96-well plate (Bio-Rad) in a reaction mixture (25 μL final volume) containing 1× Xtensa Buffer (bioworks), 200 nM primer mix, 2.5 mM MgCl2 and 0.75 U of iTaq DNA polymerase (iDNA). qPCR was performed with an initial denaturation of 3 min at 95° C., followed by 40 cycles of 20 s at 95° C., 20 s at 60° C., and 20 s at 72° C. The primers used for real-time PCR are given in “Table 3”. The reference gene used for real-time PCR was cysG. The copies of the genes in cDNA were calculated with a standard curve prepared from plasmid DNA and presented as copies per copy of cysG.
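The standard-curve quantification can be sketched as follows, assuming the usual log-linear relation Ct = slope × log10(copies) + intercept for each primer pair. The slope, intercept and Ct values below are hypothetical illustration values (a slope near -3.32 corresponds to about 100% amplification efficiency), and the function names are introduced only for this example.

```python
def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_cysg(target_cts, target_curve, cysg_cts, cysg_curve):
    """Average duplicate Ct values, convert each gene to copies with its own
    standard curve, and normalize the target to the cysG reference."""
    target_ct = sum(target_cts) / len(target_cts)
    cysg_ct = sum(cysg_cts) / len(cysg_cts)
    return (copies_from_ct(target_ct, *target_curve)
            / copies_from_ct(cysg_ct, *cysg_curve))


# Hypothetical plasmid standard curves, given as (slope, intercept).
DXS_CURVE = (-3.32, 38.0)
CYSG_CURVE = (-3.32, 38.0)

# Hypothetical duplicate Ct readings for one cDNA sample.
ratio = copies_per_cysg([21.3, 21.5], DXS_CURVE, [24.62, 24.82], CYSG_CURVE)
```

Here the target averages Ct 21.4 (about 10^5 copies) and cysG averages Ct 24.72 (about 10^4 copies), giving roughly 10 copies per copy of cysG.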

TABLE 3  qPCR primers used in this study (SEQ ID NOs: 13-28) GeneForward primer Reverse primer dxs CGGCTATCACTATAACGATGCACGACGCTTCACAATGC G crtE GTAAAGCGGGCGTTTCG GCCAGCAGCATCAGC idiTGTATTACACGGTATTGATG AGCTGGGTAAATGCAGATAATC CCACG GTT cysGTTGTCGGCGGTGGTGATGTC ATGCGGTGAACTGTGGAATAAA CG eGFP GACCACTACCAGCAGAACACGACCATGTGATCGCGCTT C tag1 CACGCATCGCAAGGCTGA TGGCTGGCCTGTTACCTGA tag2GGTCAGCCCACTACCCACAA CCCAACGGAGGCAAGGAT tag3 CGTCCTTATTGCGATCTTTACCAGGCGTTTCAACTGCTGG CG

In Vitro Transcription

Different modules (TM1/TM2/TM3-eGFP-tag1/2/3) were amplified from plasmid. Their concentrations were quantified using a NanoDrop ND-1000 spectrophotometer (Thermo Scientific), and the modules were added to the reactions in equimolar amounts. In total, 50 ng of DNA was added to a 5 μL in vitro transcription reaction using T7 RNA polymerase (12.5 U) and rNTPs (0.5 mM each) from NEB according to the manufacturer's instructions. The reactions were carried out at 37° C. for 2 hours and terminated by adding 50 μL of DEPC-treated water with 0.5 mM EDTA. 4 μL of the RNA was then used for RT-qPCR according to the described protocols.
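Splitting the 50 ng of template into equimolar portions can be sketched as below. For double-stranded DNA the mass of an equimolar portion is proportional to fragment length; the conversion uses the common approximation of 650 g/mol per base pair. The fragment lengths are hypothetical, since the amplicon sizes are not given here, and the function names are introduced only for this example.

```python
def equimolar_masses_ng(lengths_bp, total_ng=50.0):
    """Split `total_ng` of dsDNA so that every fragment has the same molar
    amount: at equal moles, mass is proportional to length."""
    total_length = sum(lengths_bp.values())
    return {name: total_ng * bp / total_length for name, bp in lengths_bp.items()}


def femtomoles(mass_ng, length_bp, g_per_mol_per_bp=650.0):
    """Approximate molar amount of a dsDNA fragment (ng -> fmol)."""
    return mass_ng * 1e6 / (length_bp * g_per_mol_per_bp)


# Hypothetical amplicon lengths for a two-module reaction.
lengths = {"TM1-eGFP-tag1": 900, "TM2-eGFP-tag2": 1100}
masses = equimolar_masses_ng(lengths)          # 22.5 ng and 27.5 ng
amounts = {name: femtomoles(masses[name], bp) for name, bp in lengths.items()}
```

Quantifying by mass alone would slightly over-represent shorter templates in molar terms, which is why the portions are scaled by length.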

Example II Combinatorial Engineering of 1-Deoxy-D-Xylulose 5-Phosphate Pathway Using Cross-Lapping In Vitro Assembly (CLIVA) Method

The ability to assemble multiple fragments of DNA into a plasmid in a single step is invaluable to studies in metabolic engineering and synthetic biology. Using phosphorothioate chemistry for high efficiency and site-specific cleavage of sequences, a novel ligase-independent cloning method (cross-lapping in vitro assembly, CLIVA) was systematically and rationally optimized in E. coli. A series of 16 constructs combinatorially expressing genes encoding enzymes in the 1-deoxy-D-xylulose 5-phosphate (DXP) pathway were assembled using multiple DNA modules. A plasmid (21.6 kb) containing 16 pathway genes was successfully assembled from 7 modules with high efficiency (2.0×10³ cfu/μg input DNA) within 2 days. Overexpression of these constructs revealed unanticipated inhibitory effects of certain combinations of genes on the production of amorphadiene. Interestingly, the inhibitory effects were correlated with an increase in the accumulation of intracellular methylerythritol cyclodiphosphate (MEC), an intermediate metabolite in the DXP pathway. The overexpression of the iron sulfur cluster operon was found to modestly increase the production of amorphadiene. This study demonstrated the utility of CLIVA in the assembly of multiple fragments of DNA into a plasmid, which enabled the rapid exploration of biological pathways.

Synthetic biology has provided tools for the design and construction of biological systems which have enabled the metabolic engineering of cellular pathways for the production of desirable compounds. For example, bacteria can now be engineered to efficiently produce a class of natural products commonly found in plants, the isoprenoids. Some of these natural compounds include high value pharmaceutical products like the antimalarial drug Artemisinin and the anticancer drug Taxol. To construct such bacteria, certain combinations of genes encoding a metabolic pathway are required to be overexpressed. The construction of such a collection of genetically engineered strains is challenging. Here, we systematically and rationally developed a new method that allows the rapid construction of large recombinant DNAs from multiple fragments in a single step. With the method, the pathway synthesizing precursors for isoprenoids was combinatorially engineered to produce amorphadiene, the precursor of Artemisinin. This study revealed unanticipated effects of certain combinations of genes. The inhibitory effects were further found to be correlated with the intracellular accumulation of an intermediate metabolite, and the co-expression of genes supplying co-factors for the downstream enzymes increased productivity. The method described herein is invaluable to studies in metabolic engineering and synthetic biology.

Synthetic biology and metabolic engineering require convenient, robust and universal tools to manipulate genetic materials. In particular, there is a demand to assemble multiple genetic components, including sequences encoding enzymes, functional fusion tags and control elements (promoters, terminators and ribosome binding sites). The commonly used sequential cloning methods based on restriction enzymes and in vitro ligation are often limited by the availability of unique restriction sites and are time consuming. Furthermore, the single-stranded DNA (ssDNA) overhangs generated by restriction enzymes are typically 2-8 nucleotides long, exhibit poor annealing efficiencies and are of limited use in assembling multiple large DNA fragments in a single step.

To address these challenges, several sequence-independent methods, generating long ssDNA overhangs or using double-stranded PCR products with long homologous sequences, have been developed for the assembly of large DNA inserts into vectors. Only a few of these approaches have reported the assembly of multiple (>3) DNA fragments in a single step. Methods such as the T4 DNA polymerase based sequence and ligation-independent cloning (SLIC), phosphorothioate-based ligase-independent gene cloning (PLICing) and others have only demonstrated the construction of plasmids of less than 8 kb. Various attempts have been made to meet the increasing demand to assemble several large fragments of DNA inserts into plasmids of >10 kb. An isothermal in vitro assembly method with synthetic oligonucleotides was used to assemble a 16.3 kb construct from seventy-five fragments of DNA and a 24 kb plasmid from four separate fragments. In addition, using the yeast in vivo recombination system, a 582 kb Mycoplasma genitalium genome was constructed from synthetic DNA oligonucleotides in several steps. The yeast system has also been successfully used for the one-step assembly of 19 kb of fragments into a plasmid or yeast chromosome. In these examples, homologous overhang sequences with lengths of 100-500 base pairs were required to increase the assembly efficiency. This can be a significant challenge because suitable pre-existing sequences in the parental or chemically synthesized templates are required, which can restrict the applicability and incur a high cost of synthesis. Furthermore, these approaches are also time consuming and labor intensive and hence are not suited for routine cloning projects.

This example describes the development of a reliable, scalable and robust cloning method (cross-lapping in vitro assembly, CLIVA) for the rapid construction of large recombinant DNA from multiple fragments in a single step. This approach exploits the unique properties of phosphorothioate-modified nucleotides, where highly efficient and site-specific cleavage is achieved using iodine in an ethanolic solution (Nakamaye, K. L., et al., “Direct sequencing of polymerase chain reaction amplified DNA fragments through the incorporation of deoxynucleoside alpha-thiotriphosphates,” Nucleic Acids Res., 16: 9947-9959 (1988); Gish, G., and Eckstein, F., “DNA and RNA sequence determination based on phosphorothioate chemistry,” Science, 240: 1520-1522 (1988)). Recently, the use of such phosphorothioate chemistry was demonstrated for the assembly of multiple small protein domains (Blanusa, M., et al., “Phosphorothioate-based ligase-independent gene cloning (PLICing): An enzyme-free and sequence-independent cloning method,” Anal. Biochem., 406: 141-146 (2010); Marienhagen, J., et al., “Phosphorothioate-based DNA recombination: an enzyme-free method for the combinatorial assembly of multiple DNA fragments,” Biotechniques, 0: 1-6 (2012)). Unique to the CLIVA method is a novel cross-lapping design, which allows the generation of long homologous overhang sequences (36-38 bases) by cleavage of optimally positioned phosphorothioate-modified nucleotides, and the use of selected cations, resulting in a highly efficient assembly process. To demonstrate the utility of this method, we constructed 16 plasmids of 7.8 kb to 21.6 kb in size, encoding various combinations of genes in the 1-deoxy-D-xylulose 5-phosphate (DXP) pathway in E. coli. To our knowledge, this is the first report of the successful assembly of large constructs containing multiple genes using an enzyme-independent in vitro method to engineer multi-enzyme pathways in a short duration.

Isoprenoids are a large and diverse class of natural products (more than 55,000) derived from five-carbon isoprene units. Some are fragrances, insecticides, nutraceuticals and pharmaceuticals, while the functions of the vast majority of the isoprenoids remain to be determined. Due to the structural complexity of many of these compounds, e.g., Artemisinin and Taxol, de novo total chemical synthesis is impractical. Metabolic engineering of microbes is a promising alternative and has been intensively explored by manipulating the 1-deoxy-D-xylulose-5-phosphate (DXP) or the mevalonate (MVA) pathway. The DXP pathway displays a more balanced redox utility than the MVA pathway in vivo. In E. coli, a few empirically selected enzymes (dxs, idi, ispD, ispF) are thought to be the limiting steps in the DXP pathway, and increasing the expression levels of these enzymes has been shown to improve isoprenoid production.

In this study, the effects of various combinations of the enzymes in the DXP pathway in providing precursors for downstream production of amorphadiene, the precursor of the antimalarial drug artemisinin (Liu, C., et al., “Artemisinin: current state and perspectives for biotechnological production of an antimalarial drug,” Appl. Microbiol. Biotechnol., 72: 11-20 (2006)), were systematically investigated for the first time (FIG. 22A). The CLIVA method enabled the rapid assembly of multiple plasmids containing various combinations of genes. Metabolic profiling using ultra-performance liquid chromatography mass spectrometry (UPLC-MS) (Zhou, K., et al., “Metabolite profiling identified methylerythritol cyclodiphosphate efflux as a limiting step in microbial isoprenoid production,” PLoS One, 7: e47513 (2012)) identified the accumulation of intracellular MEC (one of the DXP pathway intermediates) as a limiting factor for isoprenoid production. The overexpression of the iron-sulfur cluster (Isc) operon, which supplies the cofactors for the function of the two enzymes downstream of MEC (ispG and ispH) (FIG. 22A), was found to modestly enhance the production of amorphadiene.

Results

Design of CLIVA

PCR has been used to produce overlapping homologous sequences by adding extraneous tag sequences to the gene-specific primers. With such a design, the homologous sequences are limited to the length of the tags. In order to increase the assembly efficiency, we designed the tags to be homologous to the gene-specific sequences (FIG. 20A). This cross-lapping design allowed us to increase the length of the homologous sequences at each junction as compared to conventional strategies. In addition, rather than modifying all the bases in the homologous sequences, which increases the cost of primer synthesis, we explored the possibility of decreasing the modification frequency (number of phosphorothioate modifications per oligonucleotide) while maintaining a high efficiency of assembly (FIG. 20A). By the use of certain cations, the efficiency of the assembly process was substantially increased, and this enabled the construction of large plasmids from multiple fragments in one step.
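The cross-lapping idea can be illustrated with a short Python sketch. It is only one plausible reading of the design, not the primer sequences of Table 5: it assumes that each primer carries a tag copied from the adjacent module's gene-specific end, so that the two primers at a junction cover the same 2k-base window (k = 18-19 would give the 36-38 base overlap). The function names and the parameter `k` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def cross_lapping_primers(upstream, downstream, k=18):
    """Sketch of a cross-lapping primer pair for one junction.

    Assumption (illustrative only): the junction window is the last k bases
    of the upstream module followed by the first k bases of the downstream
    module. The forward primer for the downstream module and the reverse
    primer for the upstream module both span this whole window, so each
    primer's 5' tag is homologous to the other module's gene-specific end.
    """
    junction = upstream[-k:].upper() + downstream[:k].upper()
    forward_for_downstream = junction                    # 3' half primes the downstream module
    reverse_for_upstream = reverse_complement(junction)  # 3' half primes the upstream module
    return forward_for_downstream, reverse_for_upstream, junction


# Toy sequences, not real module ends
fwd, rev, junction = cross_lapping_primers("AAAACCGT", "GATTTTTT", k=4)
print(fwd, rev, len(junction))
```

With this layout the homologous region at a junction is twice as long as the tag that would be added in a conventional design with tags of the same length, which is the property the paragraph above attributes to the cross-lapping design.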

In order to demonstrate the utility of this method, we constructed a series of plasmids carrying multiple genes of a metabolic pathway. As shown in FIG. 20B, all the pathway modules, as well as a vector module containing the origin of replication and antibiotic resistance gene, were first amplified from the parental plasmids using a pair of cross-lapping primers and subsequently treated with a solution of ethanolic iodine as described in “MATERIAL AND METHODS”. The assembly was then carried out under the optimal conditions with equimolar amounts of each DNA module fragment (see below).

Optimization of CLIVA

The construction of a 7.1 kb PAC-SIDF plasmid was initially used as a model for identifying suitable designs and optimal conditions for CLIVA. The PAC-SIDF plasmid was generated by combining two modules amplified from different sources: the PAC vector (2.8 kb), consisting of the P15A origin of replication and chloramphenicol resistance gene (FIG. 22B), from a pre-existing pAC-lyc plasmid, and the SIDF module (4.3 kb), containing four 1-deoxy-D-xylulose 5-phosphate (DXP) pathway enzymes (dxs, idi, ispD, ispF; FIG. 22A), from a pre-existing pET-dxs-idi-ispDF plasmid (Tyo, K. E., et al., “Stabilized gene duplication enables long-term selection-free heterologous pathway expression,” Nat. Biotechnol., 27: 760-765 (2009)). All the primers used in the optimization process are listed in Table 5, where PAC-F/PAC-R and SIDF-F/SIDF-R are the gene-specific sequences targeting the pAC-lyc plasmid and the pET-dxs-idi-ispDF plasmid, respectively.

Ionic strength affects DNA hybridization (Lang, B. E., and Schwarz, F. P., “Thermodynamic dependence of DNA/DNA and DNA/RNA hybridization reactions on temperature and ionic strength,” Biophys. Chem., 131: 96-104 (2007)). As cations can reduce charge repulsion between the negatively charged phosphodiester backbones of double-stranded DNA, we sought to investigate the assembly efficiency in relation to the concentrations of MgCl₂ or NaCl. The assembly efficiency increased dramatically with the addition of salts, and the divalent cation (Mg²⁺) resulted in a much greater enhancement (FIG. 21A). With Na⁺, there was a positive correlation between the ionic concentration and the assembly efficiency. With Mg²⁺, a decrease in the assembly efficiency was observed at high concentrations. A limitation in using high concentrations of salts (NaCl or MgCl₂) was that these reaction mixtures were incompatible with the use of electroporation for transformation. This was consistent with the observation of severe suppression of transformation efficiency at high MgCl₂ concentration (62.5 mM) (FIG. 21B). Thus, the optimum MgCl₂ concentration was identified as 2.5 mM. We also tested other divalent ions (CuCl₂, CaCl₂, and CoCl₂) and found that Ca²⁺ acted similarly to Mg²⁺, while Co²⁺ and Cu²⁺ were found to be significantly poorer (FIG. 26). This was possibly due to the toxicity of Co²⁺ and Cu²⁺ ions at high concentrations.

Existing methods that generate ssDNA with phosphorothioate chemistry have every base of the overlap sequence chemically modified, which is cost prohibitive for long overlapping sequences (Blanusa, M., et al., “Phosphorothioate-based ligase-independent gene cloning (PLICing): An enzyme-free and sequence-independent cloning method,” Anal. Biochem., 406: 141-146 (2010); Marienhagen, J., et al., “Phosphorothioate-based DNA recombination: an enzyme-free method for the combinatorial assembly of multiple DNA fragments,” Biotechniques, 0: 1-6 (2012)). We hypothesized that it was unnecessary to cleave the overlapping sequence into single bases; instead, cleaving the overlap at several discrete sites into smaller fragments should allow the assembly to work equally well. We then tested this hypothesis using four types of 12-13 base overlap designs, O12-13/1, O12-13/4-5, O12-13/6-7 and O12-13/12-13, in which the phosphorothioate modifications were 1 base apart, 4-5 bases apart, 6-7 bases apart or 12-13 bases apart, respectively (Table 5). Unexpectedly, amplification using the O12-13/1 primer pairs (a modification inserted at every base) yielded an extremely low amount of amplicon, and this design was not used for further studies. The exact reason for this poor amplification is currently unknown. Nonetheless, the O12-13/4-5 design was successfully amplified and showed a high assembly efficiency. A slightly lower assembly efficiency was observed when using the O12-13/6-7 design, and a lower efficiency still with the O12-13/12-13 design (FIG. 21C). It is worth noting that with the O12-13/12-13 design, where a single modification was incorporated, the cleavage released a fragment of DNA that was identical to the overlap sequence and hence may have competed for annealing. This arrangement would therefore result in a lower efficiency of assembly, consistent with the observation in FIG. 21C. Increasing the modification frequency to one in every 4-5 bases did not substantially improve the efficiency of assembly as compared to one in every 6-7 bases.
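The relationship between modification spacing and the length of the pieces released by cleavage can be sketched in Python. This is a simplified illustration under stated assumptions: modifications are placed at regular intervals counted from the 5' end of the overlap, and cleavage at each modified position releases the stretch since the previous cut. The exact positions used in Table 5 are not reproduced here, and the function names are hypothetical.

```python
def modification_positions(overlap_len, spacing):
    """1-based positions of phosphorothioate modifications in the overlap,
    assuming one modification every `spacing` bases from the 5' end."""
    return list(range(spacing, overlap_len + 1, spacing))


def released_fragment_lengths(overlap_len, spacing):
    """Lengths of the pieces released when every modified position is cleaved
    (simplified model: each cut frees the bases since the previous cut)."""
    lengths = []
    previous = 0
    for position in modification_positions(overlap_len, spacing):
        lengths.append(position - previous)
        previous = position
    return lengths


# Overlap length / spacing pairs in the spirit of the O12-13/n designs
for overlap_len, spacing in [(12, 4), (13, 6), (12, 12), (36, 4), (36, 36)]:
    print(overlap_len, spacing, released_fragment_lengths(overlap_len, spacing))
```

In this model a single modification at the end of a 12-base overlap releases one 12-base piece, the same length as the overlap itself, whereas a spacing of 4 releases only 4-base pieces. This matches the explanation above that a released fragment identical to the overlap may compete for annealing.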

Another critical parameter for the assembly of multiple DNA fragments is the length of the overlaps, which determines the specificity as well as the efficiency of the annealing. As predicted, when compared to short overlaps (12-13 bases), the assembly efficiency increased with longer overlapping segments (36-38 bases) by as much as 3-fold (FIG. 21D). As the number of pathway modules to assemble increases, it is critical to have high assembly efficiency at each junction.

Extending the study, the assembly efficiencies of designs with only a single phosphorothioate modification (O12-13/12-13, O24-25/24-25 and O36-38/36-38) were examined (FIG. 27). With this arrangement, the design with a longer overlap sequence after cleavage (O24-25/24-25, where the overlap was 24-25 bases) showed lower efficiency of assembly than the shorter one (O12-13/12-13, where the overlap was 12-13 bases). In addition, an even longer overlap (the O36-38/36-38 design, where the overlap was 36-38 bases) performed even more poorly. Thus, with a single phosphorothioate modification, the efficiency of assembly was related to the length of the cleaved product, whereby the fragmented pieces of DNA should be short so as not to interact with the overlap sequences. Overall, the O36-38/4-5 design was suitable for the assembly of multiple components with high efficiency, while the O12-13/12-13 design was sufficiently efficient and cost effective to replace restriction enzyme and ligation based methods for routine tasks.

Construction of Plasmids Using the CLIVA Method

Next, we used the CLIVA method to assemble a series of plasmids consisting of various combinations of modules containing the genes of the 1-deoxy-D-xylulose 5-phosphate (DXP) pathway (Rohmer, M., “The discovery of a mevalonate-independent pathway for isoprenoid biosynthesis in bacteria, algae and higher plants,” Nat. Prod. Rep., 16: 565-574 (1999)) and for amorphadiene production (FIG. 22A). In addition, two operons, ISC (iron-sulfur cluster (Isc) operon) and SUF (sulfur mobilization (Suf) operon), containing the proteins necessary for Fe-S cluster assembly in E. coli (Py, B., and Barras, F., “Building Fe-S proteins: bacterial strategies,” Nat. Rev. Microbiol., 8: 436-446 (2010)), were also constructed (FIG. 22A). Details of the modules and their abbreviations are presented in FIG. 22B. Fragments of treated DNA were mixed and transformed into E. coli for the one-step assembly of these genes (FIG. 20B), and the correct clones were identified by quantitative colony PCR as described in “MATERIAL AND METHODS”. For each construct, two randomly selected positive clones were further confirmed by restriction mapping and at least one of these was verified by sequencing. The sequencing results covered all the sequences encoding the junctions (the overlap sequences between the modules) as well as more than 50% of the sequences in the plasmid. No change in the sequences was observed, indicative of the high fidelity of amplification and high specificity of cleavage. As expected, the efficiency decreased with increasing number of fragments (Table 4). However, even with the largest plasmid assembled (21.6 kb, S-R-DEF-GH-ISC-IAA-PAC plasmid from 6 modules), the efficiency was reliably high (~2.0×10³ cfu/μg input DNA). The false positive colonies, which lowered the accuracy of assembly, were largely due to plasmids with incomplete pathway modules (demonstrated by quantitative colony PCR and restriction mapping, data not shown).

Overexpression of GH and R-DEF Inhibited Amorphadiene Production

Next, the various combinations of pathway genes with the essential module (IAA) containing the heterologous amorphadiene synthase were tested for amorphadiene production. High induction resulted in lower production of isoprenoids (FIG. 23A, different IPTG inductions). Comparing constructs at their optimal induction levels, the expression of the first committed step (dxs, module S) enhanced amorphadiene production, as expected. However, the overexpression of the rest of the pathway genes in conjunction with the S and IAA modules had variable negative effects on productivity. Notably, the expression of the GH module (ispG and ispH) as well as the R (dxr) and DEF (ispD, ispE and ispF) modules led to a significant inhibition of production (FIG. 23A). Consistent with these observations, a simple linear model correlating the pathway modules and amorphadiene yields at their optimal inductions revealed that the expression of the GH module or the co-expression of the R and DEF modules had negative impacts (FIG. 23B).
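A linear model of this kind can be sketched as an ordinary least-squares fit of yield against module presence indicators. The Python sketch below is not the model behind FIG. 23B: the choice of terms (R, DEF, GH and an R×DEF co-expression term) is an assumption, and the yields are synthetic numbers generated from made-up coefficients purely to show the mechanics. All function and variable names are hypothetical.

```python
from itertools import product


def solve_linear_system(a, b):
    """Solve a small square system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(rows, y):
    """Ordinary least squares through the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def design_row(has_r, has_def, has_gh):
    """Intercept, R, DEF, GH, and an R x DEF co-expression term (assumed form)."""
    return [1.0, float(has_r), float(has_def), float(has_gh), float(has_r * has_def)]


# Synthetic data only: yields generated from invented coefficients
combos = list(product([0, 1], repeat=3))
rows = [design_row(*c) for c in combos]
synthetic_yield = [10.0 - 4.0 * gh - 5.0 * r * d for r, d, gh in combos]

names = ["intercept", "R", "DEF", "GH", "R x DEF"]
coefficients = fit_least_squares(rows, synthetic_yield)
for name, value in zip(names, coefficients):
    print(f"{name}: {value:+.2f}")
```

A negative fitted coefficient for a term indicates that constructs carrying that module, or that pair of modules, tend to have lower yield than the model would otherwise predict, which is how the negative impacts of GH and of R-DEF co-expression would show up in such a fit.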

In order to investigate the changes in the levels of intracellular metabolic intermediates with the overexpression of the various modules, cells were harvested after 3 h of induction and the metabolites were quantified by UPLC-MS (FIG. 23C). The induction of the expression of the genes in any of the modules resulted in significant accumulation of intracellular MEC, indicative of a limitation in metabolite conversion by the downstream genes, in agreement with our previous observations. Interestingly, the overexpression of the GH module did not fully convert MEC to the downstream metabolite IPP/DMAPP. Instead, the metabolite HMBPP accumulated in all strains where the GH module was overexpressed (FIG. 23C, the second row). Apart from this, the genes in the pathway upstream of MEC were functionally expressed, as the accumulations of the metabolites were positively correlated with the expressed genes. Hence, the overexpression of dxs, the first and committed step in the DXP pathway, resulted in the accumulation of DXP (FIG. 23C, S-IAA). Similarly, the overexpression of dxs and dxr resulted in the accumulation of MEP (FIG. 23C, S-R-IAA), and the co-expression of S-R-DEF resulted in high accumulation of MEC (FIG. 23C, S-R-DEF-IAA). In addition, higher expression of these genes resulted in parallel increases in activity (higher concentrations of accumulated intermediates).

Accumulation of Intracellular MEC was Inversely Correlated to Amorphadiene Productivity

In order to further investigate the pathway, a kinetic study measuring the concentrations of intracellular and extracellular DXP pathway metabolites and amorphadiene was carried out with strains harboring different modules. As expected, the induction of dxs resulted in a significant increase in the level of intracellular DXP in the strain with the S-IAA modules (FIG. 24A, S-IAA|DXP). Curiously, the extracellular level of DXP was also increased substantially, albeit with different kinetics (FIG. 24B, S-IAA|DXP). Similarly, the expression of the S-R-IAA modules resulted in the accumulation of both intracellular and extracellular MEP (FIGS. 24A & 24B, S-R-IAA|MEP). With all three sets of modules, MEC accumulated intracellularly, and significantly more so with the S-R-DEF-IAA modules. Intriguingly, the extracellular levels of MEC accumulated to similar levels and were inversely correlated with the inducer concentrations in strains carrying any of the three sets of modules (FIG. 24B, MEC). The inverse correlation of metabolite levels with the inducer concentration used was also observed with the production of amorphadiene. The S-R-DEF-IAA-PAC strain accumulated large quantities of intracellular MEC and yielded much less amorphadiene than strains harboring the S-IAA or S-R-IAA modules (FIG. 24B, MEC). Although high IPTG inductions yielded higher concentrations of intracellular intermediates initially (FIG. 24A, first 10 h), the relationship was reversed at later time points, especially with the highest induction (0.1 mM IPTG) (FIG. 24A, highly accumulated intermediates). Other metabolites (CDP-ME, IPP/DMAPP, GPP, FPP) were found to accumulate at insignificant levels.

Overexpression of Fe-S Operons Modestly Increased Amorphadiene Productivity

An attempt was made to increase the activities of ispG and ispH (GH module) in converting MEC to the downstream metabolite IPP/DMAPP so as to increase amorphadiene production. Because iron-sulfur (Fe-S) clusters are essential cofactors for these two enzymes, the genes of the iron-sulfur cluster (Isc) operon (ISC module: iscS, iscU, iscA, hscB, hscA, fdx) and/or the sulfur mobilization (Suf) operon (SUF module: sufA, sufB, sufC, sufD, sufS, sufE) (Py, B., and Barras, F., “Building Fe-S proteins: bacterial strategies,” Nat. Rev. Microbiol., 8: 436-446 (2010); Py, B., et al., “Fe-S clusters, fragile sentinels of the cell,” Curr. Opin. Microbiol., 14: 218-223 (2011)) were assembled using CLIVA and transformed into E. coli. Disappointingly, the overexpression of either operon together with the S-IAA modules did not enhance but instead inhibited the production of amorphadiene (FIG. 25, columns 1-3). The overexpression of the Isc operon in other constructs together with the GH module showed modest enhancements (FIG. 25, columns 4-8).

Discussion

This study demonstrated the rapid assembly of large plasmids with an array of metabolic genes (21.6 kb plasmid with 16 genes) using a ligation-independent cloning (CLIVA) method. These recombinant plasmids were then used to systematically investigate the effects of the various combinations of the enzymes in the DXP pathway in producing amorphadiene, the precursor of the antimalarial drug artemisinin (FIG. 22A). Metabolic profiling using ultra-performance liquid chromatography mass spectrometry (UPLC-MS) (Zhou, K., et al., “Metabolite profiling identified methylerythritol cyclodiphosphate efflux as a limiting step in microbial isoprenoid production,” PLoS One, 7: e47513 (2012)) identified the accumulation of intracellular MEC (one of the DXP pathway intermediates) as a potential negative contributor to isoprenoid production. The overexpression of the Isc operon, which supplies the cofactor for the function of the two enzymes downstream of MEC (ispG and ispH) (FIG. 22A), was found to modestly increase the production of amorphadiene.

The manipulation of genetic material is a fundamental and routine requirement for the engineering of biological systems, where multiple genes are assembled and used to produce downstream products. The traditional in vitro ligation based cloning methods are sequence-dependent and are often not efficient in assembling multiple fragments of DNA. These limitations have been addressed with methods that assemble multiple DNA fragments with overlapping homologous sequences in a single step. Such in vitro assembly methods, or the yeast in vivo homologous recombination based DNA assembler method, use enzymes with exonuclease activities to generate ssDNA and other enzymes to repair the over-treated non-homologous ssDNA gaps. The use of multiple enzymes not only incurs cost but is also inefficient and time consuming. Based on the phosphorothioate chemistry that allows cleavage of DNA at specific sites, the enzyme-free CLIVA method provides robust performance for the one-step assembly of multiple DNA modules. Typically, the construction can be completed within 1-2 days, as compared to the more involved method of yeast recombination (1-2 weeks).

The novel design of the cross-lapping PCR primer pair (~40 bases) enabled high efficiency of amplification by PCR and efficient assembly of multiple DNA fragments. Unlike other studies, we found that phosphorothioate modifications at intervals of 4-5 bases in the homologous sequences were sufficient to enable efficient cleavage and assembly of the sequences. The use of cations at optimal concentration was found to significantly enhance the assembly efficiency while maintaining high transformation efficiency. Even with a single phosphorothioate modification, the assembly of two DNA fragments (~3-4 kb each) was highly efficient (~2.0×10⁶ cfu/μg input DNA). This was far superior to the use of restriction enzymes and ligase (<10⁴ cfu/μg input DNA for the same construct) in parallel studies. Hence, the CLIVA method can replace all routine recombinant DNA constructions with the use of just a single phosphorothioate modification in each primer. The assembly of the 21.6 kb plasmid (S-R-DEF-GH-ISC-IAA-PAC) from 6 fragments of DNA was sufficiently efficient (~2.0×10³ cfu/μg input DNA) and was completed in less than 2 days.

With constructs encoding multiple genes under the control of the same regulatory elements (T7 promoters and terminators), there was a large amount of repeated sequence (200-300 bp) in the regions between modules. As those perfect repeats may randomly anneal with each other during assembly, it was not surprising that the assembly of such multiple identical sequences resulted in numerous false positive clones which contained partially assembled sequences, an observation confirmed by quantitative colony PCR and restriction analysis. The use of the same regulatory elements to control multiple modules is predicted to be even more challenging for recombination based methods, which are known to selectively rearrange repeated sequences in vivo (Shao, Z., et al., “DNA assembler, an in vivo genetic method for rapid construction of biochemical pathways,” Nucleic Acids Res., 37: e16 (2009)).

The S-R-DEF-IAA-PAC strain gave a lower yield of amorphadiene than the other strains (S-IAA-PAC or S-R-IAA-PAC), which encode fewer genes in the pathway. The overexpression of this poorly performing construct resulted in transient accumulation of high levels of intracellular MEC, yet showed extracellular levels similar to those of the other modules. The inverse relationship between the levels of intracellular MEC and the downstream metabolite productivity suggests an inhibitory role of MEC in regulating isoprenoid production, possibly due to an increase in oxidative stress in the cell. Recently, MEC was also identified as a signaling molecule that induces stress-responsive genes in plants (Xiao, Y., et al., “Retrograde signaling by the plastidial metabolite MEcPP regulates expression of nuclear stress-response genes,” Cell, 149: 1525-1535 (2012)), consistent with an involvement in stress response. Whether such a stress response mechanism occurs in these strains remains to be determined.

The overexpression of the module (GH) containing ispG and ispH resulted in the accumulation of HMBPP and yet did not increase amorphadiene production as would have been anticipated. One possibility is a limitation in the cofactor system (Py, B., and Barras, F., “Building Fe-S proteins: bacterial strategies,” Nat. Rev. Microbiol., 8: 436-446 (2010); Py, B., et al., “Fe-S clusters, fragile sentinels of the cell,” Curr. Opin. Microbiol., 14: 218-223 (2011)), which involves the iron-sulfur cluster, an observation consistent with a recent report in S. cerevisiae (Carlsen, S., et al., “Heterologous expression and characterization of bacterial 2-C-methyl-D-erythritol-4-phosphate pathway in Saccharomyces cerevisiae,” Appl. Microbiol. Biotechnol. (2013)). The co-expression of the Isc operon did enhance the production of amorphadiene, but the yield was significantly lower than in the strain overexpressing the S-IAA modules. Modest enhancement was observed when the GH module was co-expressed with the ISC module. Fine tuning of those genes (ispG, ispH, iscS, iscU, iscA, hscB, hscA, fdx), including controlling the expression levels and testing additional combinations, can be used to increase the flux of intracellular MEC.

Given the need to construct multiple vectors, the CLIVA method described herein provides a rapid, effective and efficient approach to identify combinations of genes useful for the production of metabolites. In this study, we found that the overexpression of related pathway genes may not simply enhance but may unpredictably inhibit downstream metabolite production. Given the complexity of cellular regulatory pathways and experimental conditions, a systematic approach to identify optimal combinations of genes for high-yield production will necessitate the construction of arrays of recombinant plasmids using the CLIVA method described herein.

Materials and Methods

Reagents, Growth Medium and Bacteria Strain

Restriction enzymes were purchased from NEB. The high-fidelity DNA polymerase (IPROOF™) from Bio-Rad was used to amplify the DNA fragments for assembly, and the ITAQ™ DNA polymerase from iDNA was used for quantitative colony PCR. Unless stated otherwise, all chemicals were purchased from either Sigma or Merck. Peptone and yeast extract were purchased from BD. Oligonucleotides were purchased from AITbiotech. Unmodified oligonucleotides were purified by desalting and the phosphorothioate-modified oligonucleotides were purified by cartridge. All the cells for plasmid construction were grown in 2×PY medium or on 2×PY agar plates containing peptone (20 g/L), yeast extract (10 g/L) and NaCl (10 g/L), with or without agar (7.5 g/L). The E. coli XL10-Gold strain (Invitrogen) was used for plasmid construction. Electroporation-competent cells were prepared as follows: 1 L of XL10-Gold cells at OD600 ≈ 0.4 was washed three times with an equal volume of cold 10% glycerol, suspended in 10 ml of cold 10% glycerol and stored at −80° C. For amorphadiene production, the E. coli BL21-Gold DE3 strain (Stratagene) harboring one of the DXP pathway plasmids together with the pRepressor plasmid carrying the lac repressor gene was cultured in production medium: peptone 20 g/L, yeast extract 10 g/L, NaCl 10 g/L, glycerol 20 g/L, HEPES 50 mM and Tween 80 5 g/L. The pRepressor plasmid was constructed by removing the T7 promoter, RBS and T7 terminator of the pET-11a (Stratagene) plasmid and replacing the ampicillin resistance gene with a kanamycin resistance gene. All the cultures contained 34 mg/L chloramphenicol and 100 mg/L kanamycin to maintain the DXP pathway plasmid and the pRepressor plasmid, respectively. The cell density was defined by absorbance at 600 nm (OD600) and measured with a SpectraMax 190 microplate reader. For amorphadiene production, a 1% (v/v) inoculum of an overnight culture was added to 0.8 ml of production medium, together with 0.2 ml of an organic dodecane phase to extract amorphadiene, in a 14 mL BD FALCON™ tube. The dodecane phase contained 1 g/L trans-caryophyllene as an internal standard for amorphadiene. Cells were grown at 37° C. with 300 rpm shaking for 2 h, until the OD600 reached the range of 0.5-0.8, and were then induced with different concentrations of isopropyl β-D-1-thiogalactopyranoside (IPTG). After induction, the cells were incubated at 28° C. with 300 rpm shaking for the rest of the experiment. The time of induction was taken as the zero time point in the study.

Quantitative Colony PCR

Quantitative colony PCR was carried out to test for the presence of successful ligations at all the junctions of the constructed plasmids using the primers listed in Table 8. For example, to confirm the S-GH-IAA-PAC plasmid, the PAC-S, S-GH and GH-IAA junctions were each verified by quantitative colony PCR. For each junction, the sense primer in the upstream module and the antisense primer in the downstream module were used as a pair to perform the real-time quantitative PCR; these were the PAC-seqF/dxs-122R, dxs-1609F/ispG-329R and ispH-693F/ADS-941R pairs, respectively. For quantitative colony PCR, the overnight cultured colonies were suspended in 100 μl of water. The real-time quantitative PCR reactions were carried out in a 25 μl final volume containing 5 μl of cell suspension, 1× Xtensa Buffer (Bioworks), 200 nM of each primer, 2.5 mM MgCl₂ and 0.75 U of iTaq DNA polymerase (iDNA). The reactions were analyzed using a BioRad ICYCLER 4™ Real-Time PCR Detection System (Bio-Rad) with SYBR Green I detection and the following protocol: an initial denaturation of 10 min at 95° C. to lyse the cells, followed by 40 cycles of 30 s at 95° C., 30 s at 60° C., and 1 min at 72° C. A melt curve was then carried out to check the melting temperature of the amplicon. Various primer pairs were selected from Table 8 to measure different module linkages in all the selected colonies. Results with a Ct value below 18 and the correct melting temperature were scored as positive.
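The scoring rule for quantitative colony PCR can be written as a small Python check. The Ct cutoff of 18 comes from the text. The melting-temperature tolerance of ±1° C., the example Ct and Tm values, and all function names are assumptions made for illustration, since the text only requires a "correct" melting temperature.

```python
CT_CUTOFF = 18.0        # from the text: a positive call needs a Ct below 18
TM_TOLERANCE_C = 1.0    # assumed tolerance; not specified in the text


def junction_is_positive(ct, observed_tm, expected_tm,
                         ct_cutoff=CT_CUTOFF, tm_tolerance=TM_TOLERANCE_C):
    """A junction is positive if its Ct is below the cutoff and its amplicon
    melts within the tolerance of the expected melting temperature."""
    return ct < ct_cutoff and abs(observed_tm - expected_tm) <= tm_tolerance


def clone_is_positive(junction_results):
    """A clone passes only if every junction of the construct is positive.
    `junction_results` maps junction name -> (ct, observed_tm, expected_tm)."""
    return all(junction_is_positive(*values) for values in junction_results.values())


# Invented readings for the three junctions of an S-GH-IAA-PAC clone
example_clone = {
    "PAC-S": (15.2, 84.1, 84.0),
    "S-GH": (16.8, 86.3, 86.0),
    "GH-IAA": (17.5, 82.9, 83.5),
}
print(clone_is_positive(example_clone))
```

Requiring every junction to pass reflects the purpose of the assay described above, namely detecting plasmids that carry incomplete pathway modules.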

Plasmid Assembling by CLIVA Method

The primers for the CLIVA optimization studies are listed in Table 5, and those for DXP pathway assembly are listed in Table 6. The design details for all 16 constructed plasmids are listed in Table 7. The modules containing various DXP pathway genes (dxs, dxr, ispD, ispE, ispF, ispG, ispH, idi, ispA) or the iron-sulfur (Fe-S) biosynthesis pathway (Isc operon, Suf operon) (FIG. 22B) were amplified from source plasmids constructed by placing those genes between the T7 promoter and T7 terminator of the pET-11a plasmid from Stratagene. Genomic DNA purified from the MG1655 DE3 strain (ATCC) was used as the original source of the E. coli genes. The ADS from Artemisia annua was codon-optimized for bacterial expression (FIG. 28, SEQ ID NO: 29). Each gene within a module has its own ribosome binding site (RBS). The PAC vector was amplified from the pAC-Lyc plasmid from a previous study (Cunningham, F. X., Jr., et al., “Molecular structure and enzymatic function of lycopene cyclase from the cyanobacterium Synechococcus sp strain PCC7942,” Plant Cell, 6: 1107-1121 (1994)). The amplified DNA fragments were purified and treated with 20 U DpnI at 37° C. for one hour. Then 100 mM Tris-HCl at pH 9, 0.3% (v/v) iodine and 10% (v/v) ethanol were added to the reactions, and the mixtures were heated at 70° C. for 5 min. If the mixture turned colorless, an additional 0.3% (v/v) iodine and 10% (v/v) ethanol were added and the mixture was heated at 70° C. for another 5 min. The DNA fragments treated with iodine and ethanol were then purified by ethanol precipitation.

For the CLIVA optimization experiments, 0.15 pmol of each fragment, together with different kinds and concentrations of salts, was heated at 80° C. for 1 min, cooled to a temperature 3° C. below the melting temperature of the overlapping sequences, held for 10 min and then cooled to 20° C. at 0.1° C./s. Then 0.5 μl of the assembly mixture was mixed with 50 μl of XL10-Gold competent cells for electroporation. For the DXP pathway assembly experiments, all DNA fragments were prepared at 0.25 μM and equal amounts of each fragment were mixed with MgCl₂ at 2.5 mM. The mixture was heated at 80° C. for 1 min, cooled to 68° C., held for 10 min and then cooled to 20° C. at 0.1° C./s. Then 0.5 μl of the assembly mixture was mixed with 50 μl of XL10-Gold competent cells for electroporation.
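The annealing program described above can be written out as a short Python sketch, mainly to make the ramp duration explicit. The function names and the tuple layout are hypothetical; the temperatures, hold times and ramp rate are the ones given in the text. The time needed to reach the hold temperature from 80° C. is not stated in the source and is therefore not modeled.

```python
def hold_temp_from_tm(overlap_tm_c, offset_c=3.0):
    """Hold temperature for the optimization runs: overlap Tm minus 3 deg C."""
    return overlap_tm_c - offset_c


def annealing_profile(hold_temp_c, denature_temp_c=80.0, denature_s=60,
                      hold_s=600, final_temp_c=20.0, ramp_c_per_s=0.1):
    """Return the program as a list of (step name, temperature in deg C, seconds).

    The last step is the slow ramp from the hold temperature down to
    final_temp_c; its duration follows from the ramp rate.
    """
    ramp_s = (hold_temp_c - final_temp_c) / ramp_c_per_s
    return [
        ("denature", denature_temp_c, denature_s),
        ("hold", hold_temp_c, hold_s),
        ("ramp to final", final_temp_c, ramp_s),
    ]
```

For the DXP pathway assemblies, `annealing_profile(68.0)` gives a final ramp of 48° C. at 0.1° C./s, that is, about 480 s (8 min), after the 1 min denaturation and the 10 min hold.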

Metabolite Measurement

Amorphadiene was trapped in the dodecane phase and quantified as previously described (Tsuruta, H., et al., “High-level production of amorpha-4,11-diene, a precursor of the antimalarial agent artemisinin, in Escherichia coli,” PLoS One, 4: e4489 (2009)). The dodecane phase was diluted 100 times in ethyl acetate, and amorphadiene was quantified on an Agilent 7890 gas chromatography/mass spectrometry (GC/MS) system by scanning the 189 and 204 m/z ions, using trans-caryophyllene as the standard. The amorphadiene concentrations were normalized to the volume of the cell suspension (0.8 ml) for reporting.
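One plausible reading of this normalization is sketched below in Python. It is an assumption, not a procedure stated in the source: the measured concentration in the diluted sample is first scaled back by the 100-fold dilution to give the concentration in the dodecane phase, and the amount in the 0.2 ml dodecane overlay is then expressed per 0.8 ml of cell suspension. The function name and the choice of mg/L as the unit are hypothetical.

```python
def amorphadiene_per_culture_volume(measured_mg_per_l, dilution_factor=100.0,
                                    dodecane_ml=0.2, culture_ml=0.8):
    """Convert a GC/MS reading of the diluted sample to mg per L of culture.

    Assumes all amorphadiene partitions into the dodecane phase.
    """
    # Undo the 100-fold dilution in ethyl acetate.
    conc_in_dodecane = measured_mg_per_l * dilution_factor
    # Express the amount in the overlay per volume of cell suspension.
    return conc_in_dodecane * dodecane_ml / culture_ml
```

Under these assumptions the overall factor is 100 × 0.2 / 0.8 = 25, so a reading of 4 mg/L in the diluted sample would correspond to 100 mg per L of culture.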

The DXP pathway intermediates (DXP, MEP, CDP-ME, CDP-MEP, MEC, HMBPP, IPP, DMAPP, GPP, FPP, FIG. 22A) were quantified by UPLC-MS as described (Zhou, K., et al., “Metabolite profiling identified methylerythritol cyclodiphosphate efflux as a limiting step in microbial isoprenoid production,” PLoS One, 7: e47513 (2012)). For extracellular metabolites, the growth medium was diluted 30 times in methanol, shaken at room temperature for 2 min and centrifuged at 20,000 g for 5 min to yield the supernatant as the sample for injection. For intracellular metabolites, 1 ml×OD600 of cells was collected and the medium was removed by centrifugation. The cell pellet was then suspended in 30 μl of water, 120 μl of methanol was added, and the mixture was shaken at room temperature for 10 min to lyse the cells and release the intermediates (Rabinowitz, J. D., and Kimball, E., “Acidic acetonitrile for cellular metabolome extraction from Escherichia coli,” Anal. Chem., 79: 6167-6173 (2007)). The cell debris was removed by centrifugation at 20,000 g for 5 min. A 5 μl volume of either the extracellular or the intracellular sample was injected. An aqueous solution containing 15 mM acetic acid and 10 mM tributylamine, and methanol, were used as the mobile phases with a UPLC C18 column (Waters CSH C18 1.7 μm 2.1×50 mm). Elution was performed at 0.15 mL/min with a gradient. A standard curve subjected to the same treatment was used to quantify the extracellular or intracellular metabolites. The detection limit was at least 5 μM in the final sample for FPP and CDP-MEP, and at least 1 μM in the final sample for the rest of the metabolites.
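Because the detection limits are quoted for the final (injected) sample, it can help to translate them back to the undiluted growth medium. The Python sketch below does this for extracellular samples under a simple assumption that is not stated in the source: the 30-fold methanol dilution is the only change in concentration between medium and injected sample. The function names are hypothetical.

```python
EXTRACELLULAR_DILUTION = 30.0  # growth medium diluted 30 times in methanol

# Detection limits in the final sample, in micromolar, as given in the text.
HIGH_LIMIT_METABOLITES = {"FPP", "CDP-MEP"}
HIGH_LIMIT_UM = 5.0
DEFAULT_LIMIT_UM = 1.0


def medium_concentration_um(sample_um, dilution=EXTRACELLULAR_DILUTION):
    """Back-calculate the concentration in the medium from the injected sample."""
    return sample_um * dilution


def detection_limit_in_medium_um(metabolite):
    """Detection limit expressed per volume of undiluted growth medium."""
    if metabolite in HIGH_LIMIT_METABOLITES:
        sample_limit = HIGH_LIMIT_UM
    else:
        sample_limit = DEFAULT_LIMIT_UM
    return medium_concentration_um(sample_limit)
```

On this reading, the stated limits correspond to 150 μM in the medium for FPP and CDP-MEP and 30 μM for the other intermediates.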

TABLE 4 Construction efficiency of the DXP pathway plasmids using CLIVA method

Plasmids                 Size (kb)  Number of pieces  Transformation efficiency  Accuracy (%)*
                                    to assemble       (×10³ cfu/μg input DNA)
IAA-PAC                  6.2        2                 3612.8                     100.0
S-IAA-PAC                8.7        2                 1052.8                     100.0
S-R-IAA-PAC              10.5       3                 78.8                       93.5
S-DEF-IAA-PAC            11.3       3                 61.3                       96.8
S-GH-IAA-PAC             11.3       3                 46.6                       83.9
S-R-DEF-IAA-PAC          13.1       4                 13.2                       42.6
S-R-GH-IAA-PAC           13.1       4                 15.6                       38.3
S-DEF-GH-IAA-PAC         13.9       4                 9.0                        27.7
S-R-DEF-GH-IAA-PAC       15.6       5                 5.1                        12.7
S-ISC-IAA-PAC            14.2       3                 15.4                       25.5
S-SUR-IAA-PAC            14.7       3                 17.4                       21.3
S-GH-ISC-IAA-PAC         16.8       4                 4.9                        14.1
S-GH-SUR-IAA-PAC         17.2       4                 4.5                        11.3
S-R-GH-ISC-IAA-PAC       18.5       5                 3.1                        9.9
S-R-GH-SUR-IAA-PAC       19.0       5                 2.6                        7.0
S-R-DEF-GH-ISC-IAA-PAC   21.6       6                 2.0                        8.5

*More than 30 colonies for each construct were analyzed by quantitative colony PCR for the accuracy calculation.

TABLE 5 Primers used for CLIVA optimization

Design         Primer name       Cross-lapping primer   Sequence
-              PAC-F             -                      GGACAGAGAGTGGAACCAACCG
-              PAC-R             -                      GCCAAGTAGCGAAGCGAGCAG
-              siDF-F            -                      TGCGACTCCTGCATTAGGAAGC
-              siDF-R            -                      TCCCCGAAAAGTGCCACCTG
O12-13/1       O13/1-PAC-F       O12/1-siDF-R           T*C*T*G*T*C*C*T*C*C*C*C*GAAAAGTGCCACCTG
O12-13/1       O12/1-PAC-R       O12/1-siDF-F           A*G*T*C*G*C*A*G*C*C*A*A*G*TAGCGAAGCGAGCAG
O12-13/1       O13/1-siDF-F      O13/1-PAC-R            C*T*T*G*G*C*T*G*C*G*A*C*T*CCTGCATTAGGAAGC
O12-13/1       O12/1-siDF-R      O13/1-PAC-F            G*G*G*G*A*G*G*A*C*A*G*A*GAGTGGAACCAACCG
O12-13/4-5     O13/4-5-PAC-F     O12/4-5-siDF-R         TCTGT*CCTC*CCC*GAAAAGTGCCACCTG
O12-13/4-5     O12/4-5-PAC-R     O13/4-5-siDF-F         AGTCG*CAGC*CAAG*TAGCGAAGCGAGCAG
O12-13/4-5     O13/4-5-siDF-F    O12/4-5-PAC-R          CTTGG*CTGC*GACT*CCTGCATTAGGAAGC
O12-13/4-5     O12/4-5-siDF-R    O13/4-5-PAC-F          GGGG*AGGA*CAGA*GAGTGGAACCAACCG
O12-13/6-7     O13/6-7-PAC-F     O12/6-7-siDF-R         TCTGTCC*TCCCC*GAAAAGTGCCACCTG
O12-13/6-7     O12/6-7-PAC-R     O13/6-7-siDF-F         AGTCGCA*GCCAAG*TAGCGAAGCGAGCAG
O12-13/6-7     O13/6-7-siDF-F    O12/6-7-PAC-R          CTTGGCT*GCGACT*CCTGCATTAGGAAGC
O12-13/6-7     O12/6-7-siDF-R    O13/6-7-PAC-F          GGGGAG*GACAGA*GAGTGGAACCAACCG
O12-13/12-13   O13/13-PAC-F      O12/12-siDF-R          TCTGTCCTCCCC*GAAAAGTGCCACCTG
O12-13/12-13   O12/12-PAC-R      O13/13-siDF-F          AGTCGCAGCCAAG*TAGCGAAGCGAGCAG
O12-13/12-13   O13/13-siDF-F     O12/12-PAC-R           CTTGGCTGCGACT*CCTGCATTAGGAAGC
O12-13/12-13   O12/12-siDF-R     O13/13-PAC-F           GGGGAGGACAGA*GAGTGGAACCAACCG
O24-25/4-5     O24/4-5-PAC-F     O24/4-5-siDF-R         CCAC*TCTC*TGTC*CTCC*CCGA*AAAG*TGCCACCTG
O24-25/4-5     O25/4-5-PAC-R     O25/4-5-siDF-F         TGCAG*GAGT*CGCA*GCCA*AGTA*GCGA*AGCGAGCAG
O24-25/4-5     O25/4-5-siDF-F    O25/4-5-PAC-R          TCGC*TACTT*GGCT*GCGA*CTCC*TGCA*TTAGGAAGC
O24-25/4-5     O24/4-5-siDF-R    O24/4-5-PAC-F          CTTT*TCGG*GGAG*GACA*GAGA*GTGG*AACCAACCG
O24-25/24-25   O24/24-PAC-F      O24/24-siDF-R          CCACTCTCTGTCCTCCCCGAAAAG*TGCCACCTG
O24-25/24-25   O25/25-PAC-R      O25/25-siDF-F          TGCAGGAGTCGCAGCCAAGTAGCGA*AGCGAGCAG
O24-25/24-25   O25/25-siDF-F     O25/25-PAC-R           TCGCTACTTGGCTGCGACTCCTGCA*TTAGGAAGC
O24-25/24-25   O24/24-siDF-R     O24/24-PAC-F           CTTTTCGGGGAGGACAGAGAGTGG*AACCAACCG
O36-38/4-5     O38/4-5-PAC-F     O38/4-5-siDF-R         GTGG*CACTTT*TCGG*GGAG*GACAG*AGAGT*GGAA*CCAA*CCG
O36-38/4-5     O36/4-5-PAC-R     O36/4-5-siDF-F         TCCTAA*TGCAG*GAGTC*GCAG*CCAA*GTAGC*GAAG*CGAG*CAG
O36-38/4-5     O36/4-5-siDF-F    O36/4-5-PAC-R          CTCG*CTTCG*CTACT*TGGCT*GCGA*CTCCT*GCATT*AGGA*AGC
O36-38/4-5     O38/4-5-siDF-R    O38/4-5-PAC-F          TTGGTT*CCAC*TCTCT*GTCC*TCCC*CGAAA*AGTG*CCAC*CTG
O38/36-38      O38/38-PAC-F      O38/38-siDF-R          GTGGCACTTTTCGGGGAGGACAGAGAGTGGAACCAA*CCG
O38/36-38      O36/36-PAC-R      O36/36-siDF-F          TCCTAATGCAGGAGTCGCAGCCAAGTAGCGAAGCGAG*CAG
O38/36-38      O36/36-siDF-F     O36/36-PAC-R           CTCGCTTCGCTACTTGGCTGCGACTCCTGCATTAGGA*AGC
O38/36-38      O38/38-siDF-R     O38/38-PAC-F           TTGGTTCCACTCTCTGTCCTCCCCGAAAAGTGCCAC*CTG

The phosphorothioate modifications are shown as *. PAC-F, PAC-R, siDF-F and siDF-R are the gene-specific sequences. An “Ox/y” designation is used to define the primers, where O denotes overlap and x is the length of the overlap, which has one modification every y bases of the sequence. For example, O13/1 is a primer with 13 bases of overlap and a phosphorothioate modification at every base. Similarly, O13/4 denotes a primer with 13 bases of overlap and a phosphorothioate modification at every 4th base. Sequences are SEQ ID NOs: 30 to 65.
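The starred primer notation of Table 5 can be decoded mechanically. The Python sketch below is an illustration only: it assumes that each * marks a phosphorothioate linkage immediately after the preceding base, which is how the table appears to be written but is not spelled out in the text, and the function names are hypothetical.

```python
def parse_modified_primer(annotated):
    """Split a '*'-annotated primer into its bare sequence and modification sites.

    Returns (sequence, positions), where positions are 1-based indices of the
    bases that are followed by a phosphorothioate linkage.
    """
    bases = []
    positions = []
    for ch in annotated:
        if ch == "*":
            # The modification follows the base read most recently.
            positions.append(len(bases))
        else:
            bases.append(ch)
    return "".join(bases), positions


def modification_spacings(positions):
    """Bases between consecutive modifications, counting from the 5' end."""
    spacings = []
    previous = 0
    for position in positions:
        spacings.append(position - previous)
        previous = position
    return spacings
```

For the Table 5 sequence `TCTGT*CCTC*CCC*GAAAAGTGCCACCTG`, this gives the 27-base sequence `TCTGTCCTCCCCGAAAAGTGCCACCTG` with modifications after bases 5, 9 and 12 (spacings of 5, 4 and 3 bases); the last modified position coincides with the 12-base overlap.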

TABLE 6 Primers used for DXP pathway construction

Name        Cross-lapping primer   Sequence
CL-pET-1F   CL-pAC-R               CTCG*CTTCG*CTACT*TGGCT*GCGA*CTCCT*GCATT*AGGA*AGC
CL-pET-2F   CL-pET-aR              CCGC*AAGAG*GCCC*GCAGT*AGTAG*GTTGA*GGCC*GTTGA
CL-pET-3F   CL-pET-bR              GTACC*GGCA*TAACC*AAGCC*ACCG*CCGC*CGC*AAGG*AAT
CL-pET-4F   CL-pET-cR              CTACA*GCATC*CAGG*GTGA*CCCT*GCCA*CCATA*CCCA*CGC
CL-pET-5F   CL-pET-dR              CGAG*GATGA*CGATG*AGCG*TGAGC*CCGA*AGTG*GCG*AGC
CL-pET-6F   CL-pET-eR              CTGAC*TGCG*TTAGC*AATTTA*ACAGC*AACC*GCAC*CTGT*GGC
CL-pET-7F   CL-pET-fR              AGAC*GAAAG*GGCC*TCGG*ATGC*GTCC*GGCG*TAGA*GGA
CL-pAC-F    CL-pET-gR              GTGG*CACTTT*TCGG*GGAG*GACAG*AGAGT*GGAA*CCAA*CCG
CL-pAC-R    CL-pET-1F              TCCTAA*TGCAG*GAGTC*GCAG*CCAA*GTAGC*GAAG*CGAG*CAG
CL-pET-aR   CL-pET-2F              GGCC*TCAAC*CTACT*ACTGC*GGGC*CTCTT*GCGG*GATA
CL-pET-bR   CL-pET-3F              CCTTG*CGGC*GGCG*GTGG*CTTG*GTTAT*GCCG*GTAC*TGC
CL-pET-cR   CL-pET-4F              TGGG*TATGG*TGGC*AGGG*TCACC*CTGGA*TGCT*GTAG*GCA
CL-pET-dR   CL-pET-5F              CGCC*ACTTC*GGGC*TCACG*CTCA*TCGT*CATC*CTCG*GCA
CL-pET-eR   CL-pET-6F              ACAGG*TGCG*GTTGC*TGTTA*AATTG*CTAAC*GCAG*TCAG*GCA
CL-pET-fR   CL-pET-7F              TCTACG*CCGG*ACGCA*TCCG*AGGC*CCTTT*CGTCT*TCA
CL-pET-gR   CL-pAC-F               TTGGTT*CCAC*TCTCT*GTCC*TCCC*CGAAA*AGTG*CCAC*CTG

The phosphorothioate modifications are shown as *. The underlined sequences were the gene-specific sequences of the primers. Sequences are SEQ ID NOs: 66 to 81.

TABLE 7 Design details for DXP pathway construction

Modules
Symbol    Genes                                Template
S         T7-dxs                               pET-dxs
R         T7-dxr                               pET-dxr
DEF       T7-ispE-ispD-ispF                    pET-DEF
GH        T7-ispG-ispH                         pET-GH
IAA       T7-ADS-ispA-idi                      pET-IAA
ISC       T7-iscS-iscU-iscA-hscB-hscA-fdx      pET-ISC
SUF       T7-sufA-sufB-sufC-sufD-sufS-sufE     pET-SUF
PAC       PAC                                  pAC-lyc
IAA-PAC   T7-ADS-ispA-idi-PAC                  IAA-PAC

Primers used to amplify the modules (only the modules amplified for each plasmid are listed)
Plasmid                  Module    Forward primer   Reverse primer
IAA-PAC                  IAA       CL-pET-1F        CL-pET-gR
IAA-PAC                  PAC       CL-PAC-F         CL-PAC-R
S-IAA-PAC                S         CL-pET-1F        CL-pET-fR
S-IAA-PAC                IAA-PAC   CL-pET-7F        CL-PAC-R
S-R-IAA-PAC              S         CL-pET-1F        CL-pET-aR
S-R-IAA-PAC              R         CL-pET-2F        CL-pET-fR
S-R-IAA-PAC              IAA-PAC   CL-pET-7F        CL-PAC-R
S-DEF-IAA-PAC            S         CL-pET-1F        CL-pET-bR
S-DEF-IAA-PAC            DEF       CL-pET-3F        CL-pET-fR
S-DEF-IAA-PAC            IAA-PAC   CL-pET-7F        CL-PAC-R
S-GH-IAA-PAC             S         CL-pET-1F        CL-pET-cR
S-GH-IAA-PAC             GH        CL-pET-4F        CL-pET-fR
S-GH-IAA-PAC             IAA-PAC   CL-pET-7F        CL-PAC-R
S-R-DEF-IAA-PAC          S         CL-pET-1F        CL-pET-aR
S-R-DEF-IAA-PAC          R         CL-pET-2F        CL-pET-bR
S-R-DEF-IAA-PAC          DEF       CL-pET-3F        CL-pET-fR
S-R-DEF-IAA-PAC          IAA-PAC   CL-pET-7F        CL-PAC-R
S-R-GH-IAA-PAC           S         CL-pET-1F        CL-pET-bR
S-R-GH-IAA-PAC           DEF       CL-pET-3F        CL-pET-cR
S-R-GH-IAA-PAC           GH        CL-pET-4F        CL-pET-fR
S-R-GH-IAA-PAC           IAA-PAC   CL-pET-7F        CL-PAC-R
S-DEF-GH-IAA-PAC         S         CL-pET-1F        CL-pET-aR
S-DEF-GH-IAA-PAC         R         CL-pET-2F        CL-pET-cR
S-DEF-GH-IAA-PAC         GH        CL-pET-4F        CL-pET-fR
S-DEF-GH-IAA-PAC         IAA-PAC   CL-pET-7F        CL-PAC-R
S-R-DEF-GH-IAA-PAC       S         CL-pET-1F        CL-pET-aR
S-R-DEF-GH-IAA-PAC       R         CL-pET-2F        CL-pET-bR
S-R-DEF-GH-IAA-PAC       DEF       CL-pET-3F        CL-pET-cR
S-R-DEF-GH-IAA-PAC       GH        CL-pET-4F        CL-pET-fR
S-R-DEF-GH-IAA-PAC       IAA-PAC   CL-pET-7F        CL-PAC-R
S-ISC-IAA-PAC            S         CL-pET-1F        CL-pET-dR
S-ISC-IAA-PAC            ISC       CL-pET-5F        CL-pET-fR
S-ISC-IAA-PAC            IAA-PAC   CL-pET-7F        CL-PAC-R
S-SUR-IAA-PAC            S         CL-pET-1F        CL-pET-eR
S-SUR-IAA-PAC            SUF       CL-pET-6F        CL-pET-fR
S-SUR-IAA-PAC            IAA-PAC   CL-pET-7F        CL-PAC-R
S-GH-ISC-IAA-PAC         S         CL-pET-1F        CL-pET-cR
S-GH-ISC-IAA-PAC         GH        CL-pET-4F        CL-pET-dR
S-GH-ISC-IAA-PAC         ISC       CL-pET-5F        CL-pET-fR
S-GH-ISC-IAA-PAC         IAA-PAC   CL-pET-7F        CL-PAC-R
S-GH-SUR-IAA-PAC         S         CL-pET-1F        CL-pET-cR
S-GH-SUR-IAA-PAC         GH        CL-pET-4F        CL-pET-eR
S-GH-SUR-IAA-PAC         SUF       CL-pET-6F        CL-pET-fR
S-GH-SUR-IAA-PAC         IAA-PAC   CL-pET-7F        CL-PAC-R
S-R-GH-ISC-IAA-PAC       S         CL-pET-1F        CL-pET-bR
S-R-GH-ISC-IAA-PAC       DEF       CL-pET-3F        CL-pET-cR
S-R-GH-ISC-IAA-PAC       GH        CL-pET-4F        CL-pET-dR
S-R-GH-ISC-IAA-PAC       ISC       CL-pET-5F        CL-pET-fR
S-R-GH-ISC-IAA-PAC       IAA-PAC   CL-pET-7F        CL-PAC-R
S-R-GH-SUR-IAA-PAC       S         CL-pET-1F        CL-pET-bR
S-R-GH-SUR-IAA-PAC       DEF       CL-pET-3F        CL-pET-cR
S-R-GH-SUR-IAA-PAC       GH        CL-pET-4F        CL-pET-eR
S-R-GH-SUR-IAA-PAC       SUF       CL-pET-6F        CL-pET-fR
S-R-GH-SUR-IAA-PAC       IAA-PAC   CL-pET-7F        CL-PAC-R
S-R-DEF-GH-ISC-IAA-PAC   S         CL-pET-1F        CL-pET-aR
S-R-DEF-GH-ISC-IAA-PAC   R         CL-pET-2F        CL-pET-bR
S-R-DEF-GH-ISC-IAA-PAC   DEF       CL-pET-3F        CL-pET-cR
S-R-DEF-GH-ISC-IAA-PAC   GH        CL-pET-4F        CL-pET-dR
S-R-DEF-GH-ISC-IAA-PAC   ISC       CL-pET-5F        CL-pET-fR
S-R-DEF-GH-ISC-IAA-PAC   IAA-PAC   CL-pET-7F        CL-PAC-R

TABLE 8 Primers used to check the constructions with quantitative colony PCR, SEQ ID NOs: 82 to 97

Name         Position         Sequence
dxs-1609F    S, sense         CCGCTTGATGAAGCGTTAATTCTGG
dxs-122R     S, antisense     GGAACGGCTCACGCTGT
dxr-704F     R, sense         AAGGTCTGGAATACATTGAAGC
dxr-782R     R, antisense     CACTGCCGTCCTGATAGC
ispF-220F    DEF, sense       TTAAAGGTGCCGATAGCC
ispE-349R    DEF, antisense   ATTGCCAGAGATGATTTAATGC
ispH-693F    GH, sense        CTCCAACTCCAACCGTCTG
ispG-329R    GH, antisense    ACGCTCTTCATTACCGATATTGC
idi-462F     IAA, sense       TGTATTACACGGTATTGATGCCACG
ADS-941R     IAA, antisense   GCTTTGGTGAAGAATACGCGAGCA
PAC-seqF     PAC, sense       CCTGCTCGCTTCGCTACT
PAC-seqR     PAC, antisense   GCGGTGCGGACTGTTG
FDX-89F      ISC, sense       CTCTGCGTAACGGTATCG
iscS-601R    ISC, antisense   ACATCAGGTCAACTTTCAACT
surfA-334R   SUF, sense       TCTGGGCTTTAGGGTTGT
surfE-273F   SUF, antisense   GATGACGCCGCAGGATAT

It should be understood that for all numerical bounds describing some parameter in this application, such as “about,” “at least,” “less than,” and “more than,” the description also necessarily encompasses any range bounded by the recited values. Accordingly, for example, the description at least 1, 2, 3, 4, or 5 also describes, inter alia, the ranges 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, and 4-5, et cetera.

For all patents, applications, or other references cited herein, such as non-patent literature and reference sequence information, it should be understood that each is incorporated by reference in its entirety for all purposes as well as for the proposition that is recited. Where any conflict exists between a document incorporated by reference and the present application, this application will control. All information associated with reference gene sequences disclosed in this application, such as GeneIDs or accession numbers (typically referencing NCBI accession numbers), including, for example, genomic loci, genomic sequences, functional annotations, allelic variants, and reference mRNA (including, e.g., exon boundaries or response elements) and protein sequences (such as conserved domain structures) as well as chemical references (e.g. Pub Chem compound, Pub Chem substance, or Pub Chem Bioassay entries, including the annotations therein, such as structures and assays et cetera) are hereby incorporated by reference in their entirety.

Headings used in this application are for convenience only and do not affect the interpretation of this application.

Preferred features of each of the aspects provided by the invention are applicable to all of the other aspects of the invention mutatis mutandis and, without limitation, are exemplified by the dependent claims and also encompass combinations and permutations of individual features (e.g. elements, including numerical ranges and exemplary embodiments) of particular embodiments and aspects of the invention including the working examples. For example, particular experimental parameters exemplified in the working examples can be adapted for use in the claimed invention piecemeal without departing from the invention. For example, for materials that are disclosed, while specific reference to each of the various individual and collective combinations and permutations of these compounds may not be explicitly disclosed, each is specifically contemplated and described herein. Thus, if a class of elements A, B, and C is disclosed as well as a class of elements D, E, and F, and an example of a combination of elements, A-D, is disclosed, then even if each is not individually recited, each is individually and collectively contemplated. Thus, in this example, each of the combinations A-E, A-F, B-D, B-E, B-F, C-D, C-E, and C-F is specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. Likewise, any subset or combination of these is also specifically contemplated and disclosed. Thus, for example, the sub-group of A-E, B-F, and C-E is specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. This concept applies to all aspects of this application, including elements of a composition of matter and steps of methods of making or using the compositions.

The foregoing aspects of the invention, as recognized by the person having ordinary skill in the art following the teachings of the specification, can be claimed in any combination or permutation to the extent that they are novel and non-obvious over the prior art; thus, to the extent an element is described in one or more references known to the person having ordinary skill in the art, it may be excluded from the claimed invention by, inter alia, a negative proviso or disclaimer of the feature or combination of features.

While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

What is claimed is:
 1. An expression vector, comprising: at least a first coding region and a second coding region; the first coding region encoding at least a first gene product, the first coding region being operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer; and the second coding region encoding at least a second gene product, the second coding region being operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer.
 2. The expression vector of claim 1, further including a third coding region encoding at least a third gene product, the third coding region being operably linked to a third inducible promoter, the third inducible promoter being of a third strength, different from the first strength and the second strength, and being responsive to the inducer.
 3. The vector of claim 1, wherein: the first coding region encodes at least a first enzyme, the first enzyme catalyzing a first reaction in a multi-step enzymatic pathway; and the second coding region encodes at least a second enzyme, the second enzyme catalyzing a second reaction in the multi-step enzymatic pathway.
 4. The vector of claim 3, wherein the multi-step enzymatic pathway is the lycopene synthetic pathway or the amorphadiene synthetic pathway.
 5. The expression vector of claim 1, wherein the first and the second inducible promoters are each a derivative of a single RNA polymerase promoter.
 6. The expression vector of claim 5, wherein the derivative is an RNA polymerase promoter that includes a mutation in a region selected from a melting region or an initiation region.
 7. The expression vector of claim 6, wherein the RNA polymerase promoter is selected from a T7 RNA polymerase promoter, a T5 RNA polymerase promoter, a T3 RNA polymerase promoter, or an SP6 RNA polymerase promoter.
 8. A cell transfected with the vector of claim 1.
 9. (canceled)
 10. A kit, comprising at least two expression vectors, the first expression vector comprising a coding region encoding at least a first gene product, the coding region being operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer; and the second expression vector comprising a coding region encoding at least a second gene product, the coding region being operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer.
 11. The kit of claim 10, wherein: the coding region of the first expression vector encodes at least a first enzyme, the first enzyme catalyzing a first reaction in a multi-step enzymatic pathway; and the coding region of the second expression vector encodes at least a second enzyme, the second enzyme catalyzing a second reaction in the multi-step enzymatic pathway.
 12. A method of expressing at least a first coding region and a second coding region in a cell, the method comprising: providing a cell comprising an expression vector of claim 1 comprising at least the first coding region and the second coding region, wherein: the first coding region is operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer, the second coding region is operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer; and contacting the cell with the inducer, thereby expressing the first coding region and the second coding region.
 13. The method of claim 12, wherein the first coding region encodes at least a first enzyme, the first enzyme catalyzing a first reaction in a multi-step enzymatic pathway; and the second coding region encodes at least a second enzyme, the second enzyme catalyzing a second reaction in the multi-step enzymatic pathway.
 14. The method of claim 12, wherein the expression vector further comprises a third coding region, the third coding region being operably linked to a third inducible promoter, the third inducible promoter being of a third strength, different from the first strength and the second strength, and being responsive to the inducer.
 15. A method of expressing at least a first coding region and a second coding region in a cell, the method comprising: providing a cell comprising at least a first expression vector comprising at least the first coding region encoding a first gene product, and at least a second expression vector comprising at least the second coding region encoding a second gene product, wherein: the first coding region is operably linked to a first inducible promoter, the first inducible promoter being of a first strength and being responsive to an inducer, the second coding region is operably linked to a second inducible promoter, the second inducible promoter being of a second strength, different from the first strength, and being responsive to the inducer; and contacting the cell with the inducer, thereby expressing the first coding region and the second coding region.
 16. A method of optimizing yield of a product of a multi-step enzymatic pathway in a host cell, the multi-step enzymatic pathway including at least a first reaction catalyzed by a first enzyme, and a second reaction catalyzed by the second enzyme, the method comprising: determining optimal levels of expression of the first and the second enzymes; determining the ratio of a strength of a first inducible promoter to a strength of a second inducible promoter, the ratio of the strengths corresponding to the optimal levels of expression of the first and the second enzymes, the first and the second promoters being responsive to the same inducer; and constructing an expression vector of claim 3 comprising: a first coding region encoding the first enzyme, the first coding region being operably linked to the first inducible promoter; and a second coding region encoding the second enzyme, the second coding region being operably linked to the second inducible promoter.
 17. The method of claim 16, further including contacting the host cell with the inducer to induce expression of the first and the second enzymes.
 18. The method of claim 16, further including: determining an optimal level of expression of a third enzyme, the third enzyme catalyzing a third reaction in the multi-step enzymatic pathway; determining the ratio of the strengths of the first inducible promoter to the second inducible promoter, to a third inducible promoter, the ratio of the strengths corresponding to the optimal levels of expression of the first enzyme, the second enzyme, and the third enzyme, the first, the second, and the third promoters being responsive to the same inducer; and constructing an expression vector comprising: the first coding region encoding the first enzyme, the first coding region being operably linked to the first inducible promoter; the second coding region encoding the second enzyme, the second coding region being operably linked to the second inducible promoter; and a third coding region encoding the third enzyme, the third coding region being operably linked to the third inducible promoter.
 19. A method of gene cloning, comprising: contacting each of a vector and a set of inserts with a pair of first terminal primers, a pair of second terminal primers, and at least one pair of linking primers, wherein: the set of inserts including at least a first and a second insert, the inserts in the set of inserts including at least a first coding region and a second coding region, each of the first terminal primers includes a first region complementary to a region of the vector and a second region complementary to a region of a first insert, each of the second terminal primers includes a first region complementary to a region of the vector and a second region complementary to a region of an insert different from the first insert, each of the linking primers includes a first region complementary to a region of an insert in the set of inserts and a second region complementary to a region of a different insert in the set of inserts, and wherein each primer includes at least one phosphorothioate internucleotide linkage; amplifying the vector and at least two inserts to produce a vector amplification product and at least two insert amplification products, each including at least one phosphorothioate internucleotide linkage; non-enzymatically cleaving the vector amplification product and the at least two insert amplification products at the at least one phosphorothioate internucleotide linkage to produce complementary single-stranded overhangs; annealing the vector amplification product and the at least two insert amplification products in the presence of a cation and thereby non-enzymatically assembling a transforming product; and introducing the transforming product into a host cell.
 20. The method of claim 19, wherein the set of inserts includes at least one additional insert comprising at least one additional coding region, further including: contacting the at least one additional insert with a pair of linking primers; amplifying the at least one additional insert to produce at least one additional insert amplification product; non-enzymatically cleaving the at least one additional insert amplification product at the at least one phosphorothioate internucleotide linkage to produce complementary single-stranded overhangs; annealing the vector amplification product, the at least two insert amplification products, and the at least one additional insert amplification product in the presence of a cation to non-enzymatically assemble the transforming product.
 21. The method of claim 19, wherein: the complementary single-stranded overhangs are at least 14 basepairs long; the phosphorothioate internucleotide linkage is repeated every two or more nucleotides, and annealing the vector amplification product and the at least two gene amplification products is performed in at least about 0.5 mM of a cation selected from Mg²⁺, Ca²⁺, Co²⁺, Cu²⁺, or a combination thereof.